
Spark Streaming and mocking HDFS

I need to implement a test for some Spark Streaming code. The code runs in a separate JVM, launched through this library, and the application's input comes from HDFS. I've started a MiniDFSCluster as in this example (Java version), but I don't think it will work because the two are in different JVMs.

What would be the best approach to mock the HDFS input so that I can successfully test the Spark Streaming code?

I have described the scenario in general terms. The real requirement is to implement a passing Cucumber test.

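Instead of trying to mock HDFS, you could try running Spark in local mode and specifying a path such as "file:///foo/bar". The local filesystem will then be used in place of HDFS.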
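
A minimal sketch of that idea in PySpark, assuming the DStream API (`StreamingContext.textFileStream`) and an application whose input path is configurable. The helper names `make_input_dir`, `drop_file` and `collect_lines` are hypothetical and only for illustration. Because the input is an ordinary local directory, a test running in one JVM or process can feed files to an application running in another, provided both see the same filesystem.

```python
import os
import pathlib
import tempfile


def make_input_dir():
    """Create a temporary local directory; return (path, file:// URI)."""
    path = pathlib.Path(tempfile.mkdtemp(prefix="stream-input-")).resolve()
    return path, path.as_uri()


def drop_file(input_dir, name, text):
    """Write the file in a staging directory, then move it into input_dir.

    File-based streams expect files to appear in the watched directory
    in one step, so the file is written elsewhere and renamed into place.
    """
    staging = pathlib.Path(tempfile.mkdtemp(prefix="stream-staging-"))
    tmp = staging / name
    tmp.write_text(text, encoding="utf-8")
    target = pathlib.Path(input_dir) / name
    os.replace(tmp, target)  # atomic when both paths are on one filesystem
    os.rmdir(staging)
    return target


def collect_lines(input_uri, feed, batch_seconds=1, timeout_seconds=10):
    """Run a local-mode streaming job over input_uri and return the lines seen.

    `feed` is a callable (an assumption of this sketch) that drops the
    test files; it is called only after the stream has started, because
    files already present at start-up are not picked up.
    """
    # Imported lazily so the helpers above work without PySpark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "local-fs-streaming-test")
    ssc = StreamingContext(sc, batch_seconds)
    seen = []
    ssc.textFileStream(input_uri).foreachRDD(
        lambda rdd: seen.extend(rdd.collect())
    )
    ssc.start()
    try:
        feed()
        ssc.awaitTerminationOrTimeout(timeout_seconds)
    finally:
        ssc.stop(stopSparkContext=True, stopGraceFully=False)
    return seen
```

A test step might then call `input_dir, uri = make_input_dir()` followed by `collect_lines(uri, lambda: drop_file(input_dir, "a.txt", "hello\n"))` and assert on the returned list. When the application under test runs in its own JVM, only the first two helpers are needed: pass the `file://` URI to the application as its input path and drop files from the test process.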


 