
Spark Streaming Kafka Stream batch execution

I'm new to Spark Streaming and have a general question about its usage. I'm currently implementing an application that streams data from a Kafka topic.

Is it a common scenario to run the application as a single batch only once, for example at the end of the day, collecting all the data from the topic and performing some aggregations and transformations on it?

That means after starting the app with spark-submit, all of this would be performed in one batch and then the application would shut down. Or is Spark Streaming built to run endlessly and permanently, processing the stream in continuous batches?

You can use the Kafka Streams API and fix a window time to perform the aggregations and transformations over the events in your topic one window (batch) at a time. For more information about windowing, see https://kafka.apache.org/21/documentation/streams/developer-guide/dsl-api.html#windowing
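As a minimal sketch of that idea, a daily tumbling-window aggregation with the Kafka Streams DSL could look like the following (the topic names "events" and "daily-aggregates", the application id, the broker address, and the sum aggregation are all placeholder assumptions, not part of the original question):

import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class DailyAggregation {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "daily-aggregation"); // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("events", Consumed.with(Serdes.String(), Serdes.Long())) // hypothetical input topic
               .groupByKey()
               .windowedBy(TimeWindows.of(Duration.ofDays(1)))   // one tumbling window per day
               .reduce(Long::sum)                                // e.g. sum the values per key and day
               .toStream((windowedKey, total) ->
                       windowedKey.key() + "@" + windowedKey.window().start())
               .to("daily-aggregates", Produced.with(Serdes.String(), Serdes.Long())); // hypothetical output topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

Unlike a one-shot spark-submit job, this application keeps running and emits one aggregated result per key each time a daily window closes, which gives you batch-like results without having to stop and restart anything.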
