Spark streaming and mocking hdfs
There is a requirement to implement a test for some Spark Streaming code. This particular code runs in a separate JVM by using this library, and the input for that application is hdfs.
I've started a MiniDFSCluster as in this example (Java version), but I don't think it will work because the two are in different JVMs.
What would be the best approach to mock the hdfs input so that I can successfully test the Spark Streaming code?
I have explained the scenario above in general terms. The real requirement is to implement a successful Cucumber test.
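You could try running Spark in local mode and specifying a path such as "file:///foo/bar" instead of trying to mock hdfs. The local file system will then be used in place of hdfs.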
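As a rough sketch of the test-side setup for the local-path suggestion: the helper below creates a temporary directory, exposes it as a `file:///` URI, and drops an input file into it with a single rename. The names `make_input_dir`, `drop_file`, and `batch-0001.txt` are made up for illustration. It assumes the streaming job takes its input path as a configurable argument and runs on the same machine as the test, so the separate JVM can read the same local directory. The rename step is there because, as far as I know, Spark's file-based streams expect complete files to appear in the monitored directory rather than files that are still being written.

```python
import os
import tempfile
from pathlib import Path


def make_input_dir():
    """Create a fresh local directory that stands in for the hdfs input path."""
    return Path(tempfile.mkdtemp(prefix="stream-input-"))


def drop_file(input_dir, name, lines):
    """Write the file in a sibling staging directory, then move it in with one rename."""
    # Staging lives next to the input directory so both are on the same filesystem.
    staging = Path(tempfile.mkdtemp(prefix="staging-", dir=str(input_dir.parent)))
    tmp = staging / name
    tmp.write_text("\n".join(lines) + "\n", encoding="utf-8")
    target = input_dir / name
    os.replace(str(tmp), str(target))  # a single rename, so no half-written file is visible
    staging.rmdir()
    return target


input_dir = make_input_dir()
# Absolute file:// URI to hand to the streaming job instead of an hdfs:// path.
input_uri = input_dir.resolve().as_uri()
dropped = drop_file(input_dir, "batch-0001.txt", ["hello", "world"])
print(input_uri)

# Assumption: the job under test is started in local mode (for example with
# master "local[2]") and reads this URI, e.g. via textFileStream(input_uri).
```

A Cucumber step could call something like `drop_file` in a "Given" step and then poll the job's output in a "Then" step. How the path is passed to the job depends on how the library mentioned in the question launches it.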