
Apache Storm Sliding Window in Realtime

I use Apache Storm's sliding window to process data with a 24-hour window length and a 1-hour sliding interval. When the first TupleWindow arrives, an aggregation process starts for every tuple in the TupleWindow.

My aggregation process waits until the next TupleWindow arrives. As soon as it arrives, the aggregation starts and consumes a lot of resources. I wonder whether Apache Storm can deliver data in real time (without waiting until the last item of the window arrives). In that case I could aggregate everything in real time.

Is there any configuration for that?

Thanks

Right now there's no way to incrementally compute aggregates before a window triggers. Storm lets you access the new events that arrived since the last window (Window.getNew) and the events that expired since the last window (Window.getExpired). You could use these to optimize the aggregation by computing just the delta when the window triggers.
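The delta idea can be sketched in plain Java, without any Storm dependency. The class name DeltaAggregator and its update method are hypothetical, and a running sum stands in for whatever aggregate you actually compute. In a windowed bolt, the two lists would presumably be built from the tuples returned by getNew() and getExpired() on the TupleWindow passed to execute.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: keeps a running sum and count across window triggers.
public class DeltaAggregator {
    private long sum = 0;
    private long count = 0;

    // Apply only the delta: add values that entered the window since the
    // last trigger and subtract values that expired since the last trigger.
    public long update(List<Long> newValues, List<Long> expiredValues) {
        for (long v : newValues) {
            sum += v;
            count++;
        }
        for (long v : expiredValues) {
            sum -= v;
            count--;
        }
        return sum;
    }

    public long getSum() { return sum; }

    public long getCount() { return count; }

    public static void main(String[] args) {
        DeltaAggregator agg = new DeltaAggregator();
        // First trigger: everything in the window is new, nothing has expired.
        agg.update(Arrays.asList(1L, 2L, 3L), Collections.<Long>emptyList());
        // Next trigger: one value arrived and one expired.
        agg.update(Arrays.asList(4L), Arrays.asList(1L));
        System.out.println("sum=" + agg.getSum() + " count=" + agg.getCount());
    }
}
```

This only works directly for aggregates that can be reversed (sum, count, average). Something like a maximum would need extra bookkeeping, since removing an expired value does not tell you the next largest one.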

The other option is to use a count-based sliding interval that triggers the window after every 'n' events, which keeps the number of events you have to process per trigger manageable, and then use a similar approach.
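To illustrate why a count-based trigger bounds the work per trigger, here is a small simulation in plain Java. It is not Storm code: the class CountTriggeredWindow is hypothetical, and for simplicity the window length is also expressed as an event count rather than 24 hours. In Storm itself, my understanding is that this would be configured on a BaseWindowedBolt with something like withWindow(Duration.hours(24), Count.of(n)); please check that against the version you run.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical simulation of a window that triggers after every `slide` events.
public class CountTriggeredWindow {
    private final int windowLength;                        // max events kept in the window
    private final int slide;                               // trigger after this many new events
    private final Deque<Long> window = new ArrayDeque<>();
    private final List<Long> pending = new ArrayList<>();  // events since the last trigger
    private long sum = 0;                                  // aggregate, updated only on trigger

    public CountTriggeredWindow(int windowLength, int slide) {
        this.windowLength = windowLength;
        this.slide = slide;
    }

    // Buffers one event; returns true if this event caused the window to trigger.
    public boolean add(long value) {
        pending.add(value);
        if (pending.size() < slide) {
            return false;
        }
        // Trigger: apply the new events, at most `slide` of them.
        for (long v : pending) {
            window.addLast(v);
            sum += v;
        }
        pending.clear();
        // Evict the oldest events that no longer fit in the window.
        while (window.size() > windowLength) {
            sum -= window.removeFirst();
        }
        return true;
    }

    public long getSum() { return sum; }

    public static void main(String[] args) {
        CountTriggeredWindow w = new CountTriggeredWindow(4, 2);
        for (long v = 1; v <= 6; v++) {
            if (w.add(v)) {
                System.out.println("triggered after " + v + ", sum=" + w.getSum());
            }
        }
    }
}
```

Each trigger touches at most `slide` new events plus the events they push out, so the cost per trigger stays small no matter how long the window is.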
