Count Elements Inside Apache Spark DStream
I need to retrieve the number of elements inside a DStream using Java. Reading the documentation, I tried something like the following:
JavaDStream<Object> stream;
stream.count()
It returns a DStream object instead of a number.
How can I get the number of elements in a DStream? I need it in a test suite.
You cannot.
A DStream represents an infinite sequence of RDDs, so it is not really meaningful to ask for the total number of elements. This is also why count() gives you another DStream: it yields one count per batch RDD rather than a single number.
You can add stateful operations that keep track of the number of values and update it per window, but that is not the same as asking for a count over the whole stream. You can check MapWithStateSuite to see how testing state can be implemented.
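To make the difference concrete, here is a minimal plain-Java sketch that models the idea without using Spark at all. It assumes a stream can be pictured as a list of batches; the class BatchCountModel and its methods countPerBatch and runningTotals are hypothetical names invented for this illustration and are not part of any Spark API. The first method mimics what count() produces (one number per batch), and the second mimics a stateful operation that carries a running total across batches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Spark code: a "stream" is pictured as a list of batches.
public class BatchCountModel {

    // Models count() on a stream: the result is one count per batch,
    // which is itself a sequence, not a single number.
    public static <T> List<Long> countPerBatch(List<List<T>> batches) {
        List<Long> counts = new ArrayList<>();
        for (List<T> batch : batches) {
            counts.add((long) batch.size());
        }
        return counts;
    }

    // Models a stateful operation: the state (a running total) is carried
    // from one batch to the next and emitted after each batch.
    public static List<Long> runningTotals(List<Long> perBatchCounts) {
        List<Long> totals = new ArrayList<>();
        long state = 0L;
        for (long count : perBatchCounts) {
            state += count;
            totals.add(state);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<List<String>> batches = List.of(
                List.of("a", "b", "c"),
                List.<String>of(),
                List.of("d", "e"));

        List<Long> counts = countPerBatch(batches);
        System.out.println(counts);                // [3, 0, 2]
        System.out.println(runningTotals(counts)); // [3, 3, 5]
    }
}
```

In a test suite the same shape usually applies: feed a known, finite set of batches, collect the per-batch counts as they arrive, and assert on those (or on their sum) once the batches have been processed. There is never a point at which the stream itself has a final count.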
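topNUrl.foreachRDD { rdd =>
  // foreachRDD returns Unit, so read the count inside the closure, once per batch
  val count = rdd.count()
  println(s"Elements in this batch: $count")
}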