
Count Elements Inside Apache Spark DStream

I need to retrieve the number of elements inside a DStream using Java. Following the documentation, I did something like this:

JavaDStream<Object> stream;

stream.count()

It returns a DStream object instead of a number.

How can I get the number of elements in a DStream? I need it in a test suite.

You cannot. A DStream represents an infinite sequence of RDDs, so it is not really meaningful to ask for the total number of elements.

You can add a stateful operation that keeps a running count and updates it with each batch or window, but that is not the same as asking for a count over the whole stream. See MapWithStateSuite in the Spark source for an example of how state can be tested.

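// foreachRDD returns Unit, so the count has to be used inside the closure.
topNUrl.foreachRDD { rdd =>
  val count = rdd.count() // number of elements in this batch only
  println(s"Batch count: $count")
}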
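Since the question asks for Java and a test suite, here is a minimal sketch of the same per-batch idea in Java. It assumes Spark 2.x or later on the classpath and a finite input fed through queueStream. The class name DStreamCountSketch, the method countElements, the 200 ms batch interval, and the timeout are all made up for illustration. The result is only the number of elements seen before the timeout, not a count of the stream as a whole.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class DStreamCountSketch {

    // Hypothetical helper: feeds the given batches through a local streaming
    // context and returns how many elements were seen before the timeout.
    static long countElements(List<List<String>> batches, long timeoutMs)
            throws InterruptedException {
        SparkConf conf = new SparkConf()
                .setMaster("local[2]")
                .setAppName("dstream-count-sketch");
        JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.milliseconds(200));
        try {
            // Fill the queue before creating the stream; by default queueStream
            // emits one queued RDD per batch interval.
            Queue<JavaRDD<String>> queue = new LinkedList<>();
            for (List<String> batch : batches) {
                queue.add(jssc.sparkContext().parallelize(batch));
            }
            JavaDStream<String> stream = jssc.queueStream(queue);

            // The foreachRDD body runs on the driver, so a driver-side
            // counter can accumulate the per-batch counts.
            AtomicLong total = new AtomicLong();
            stream.foreachRDD(rdd -> total.addAndGet(rdd.count()));

            jssc.start();
            // Assumption: the timeout is long enough for every queued batch.
            jssc.awaitTerminationOrTimeout(timeoutMs);
            return total.get();
        } finally {
            jssc.stop(true, false);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long seen = countElements(
                Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c")), 2000L);
        System.out.println("elements seen: " + seen);
    }
}
```

In a test, the returned value can be compared with the size of the input. The timeout needs to cover at least one batch interval per queued RDD, plus some slack for job scheduling.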
