
Passing elements back to the input stream, after processing, in Flink?

Scenario:

I have a stream of events coming from a sensor. Each event is either T-type or J-type.

  • T-type events carry the timestamp at which the event occurred.
  • J-type events have a start and end timestamp.

Based on the start and end timestamps of a J-type event, I want to apply aggregation logic to all the T-type events that fall within that time range and write the result to a DB.

For this, I have created a custom trigger that fires when a J-type event is received. In my custom ProcessWindowFunction, I perform the aggregation logic and the time check.

However, a T-type event may not fall within the time range of the current J-type event. In that case, the T-type event should be pushed to the next window before the current window is purged.

[Image: stream windows]

Solutions I have considered:

  1. Push the unprocessed T-type events back into the Kinesis stream (the source) from the custom window process function. (Worst-case solution.)

  2. Use FIRE instead of FIRE_AND_PURGE so that the window state is kept for the whole runtime, and remove processed elements using the elements Iterator. (Not recommended, because it keeps an infinite window.)

I would like to know whether there is any way to push the unprocessed events directly back to the input stream (re-queuing), without going through Kinesis.

Or:

Is there any way to maintain state in the keyBy context, so that the computation can be performed on these unprocessed events before, or along with, the window elements?

Here are two solutions. They are more-or-less equivalent in their underlying behavior, but you might find one or the other easier to understand, maintain, or test.

As for your question, no, there is no way to loop back (re-queue) the unconsumed events without pushing them back to Kinesis. But simply holding on to them until they are needed should be fine.

Solution 1: Use a RichFlatMapFunction

As T-type events arrive, append them to a ListState object. When a J-type event arrives, collect to the output all matching T-type events from the list, and update the list to only retain those T-type events that will belong to later J-type events.
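The buffering logic described above can be sketched without any Flink dependency. This is only a model of the idea, not a Flink operator: in a real RichFlatMapFunction the list would live in keyed ListState (read with `get()`, rewritten with `update(...)`) and the matches would be emitted through the Collector. The class name `PendingTEvents`, its method names, and the rule that T events older than the J event's start are dropped are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Flink-free model of the per-key buffer that Solution 1 would keep in ListState. */
final class PendingTEvents {
    // Timestamps of T-type events that no J-type event has claimed yet.
    private List<Long> pending = new ArrayList<>();

    /** Called for every T-type event: just remember it. */
    void onTEvent(long timestamp) {
        pending.add(timestamp);
    }

    /**
     * Called for every J-type event. Returns the buffered T timestamps inside
     * [start, end] and retains only those later than end for future J events.
     * Assumption: T events earlier than start can never match again, so they are dropped.
     */
    List<Long> onJEvent(long start, long end) {
        List<Long> matched = new ArrayList<>();
        List<Long> retained = new ArrayList<>();
        for (long ts : pending) {
            if (ts > end) {
                retained.add(ts);      // belongs to a later J-type event
            } else if (ts >= start) {
                matched.add(ts);       // falls inside the current J range
            }
        }
        pending = retained;            // analogous to ListState.update(retained)
        return matched;
    }

    /** Copy of the timestamps still waiting for a J-type event. */
    List<Long> pendingSnapshot() {
        return new ArrayList<>(pending);
    }

    public static void main(String[] args) {
        PendingTEvents buffer = new PendingTEvents();
        buffer.onTEvent(5);
        buffer.onTEvent(12);
        buffer.onTEvent(25);
        System.out.println("matched:  " + buffer.onJEvent(0, 20));
        System.out.println("retained: " + buffer.pendingSnapshot());
    }
}
```

With the sample input, the T events at 5 and 12 are handed to the J event covering 0 to 20, while the T event at 25 stays buffered for a later J event.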

Solution 2: Use GlobalWindows with a custom Trigger and Evictor

In addition to what you've already done, implement an Evictor that (after the window has been FIREd) removes only the J-type event and all matching T-type events from the window.
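The eviction rule can be illustrated in plain Java. In Flink, an Evictor's `evictAfter` method receives the window contents as an Iterable (wrapped in TimestampedValue) and removes elements through the iterator; the sketch below applies the same pattern to an ordinary list. The `SensorEvent` type, its fields, and the helper name `evictAfterFire` are hypothetical, and the sketch assumes the window holds at most one J-type event when it fires.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical event type: T events carry one timestamp, J events a start and an end. */
final class SensorEvent {
    final boolean isJ;
    final long start; // for T events, start == end == time of occurrence
    final long end;

    private SensorEvent(boolean isJ, long start, long end) {
        this.isJ = isJ;
        this.start = start;
        this.end = end;
    }

    static SensorEvent t(long timestamp) {
        return new SensorEvent(false, timestamp, timestamp);
    }

    static SensorEvent j(long start, long end) {
        return new SensorEvent(true, start, end);
    }
}

final class EvictAfterFireSketch {
    /** Removes the first J event and every T event inside its range; other T events stay. */
    static void evictAfterFire(Iterable<SensorEvent> windowContents) {
        SensorEvent j = null;
        for (SensorEvent e : windowContents) {
            if (e.isJ) {
                j = e;
                break;
            }
        }
        if (j == null) {
            return; // no J event in the window, so nothing was processed
        }
        for (Iterator<SensorEvent> it = windowContents.iterator(); it.hasNext(); ) {
            SensorEvent e = it.next();
            boolean matchedT = !e.isJ && e.start >= j.start && e.start <= j.end;
            if (e == j || matchedT) {
                it.remove(); // same removal mechanism an Evictor uses
            }
        }
    }

    public static void main(String[] args) {
        List<SensorEvent> window = new ArrayList<>();
        window.add(SensorEvent.t(5));
        window.add(SensorEvent.t(25));
        window.add(SensorEvent.j(0, 20));
        evictAfterFire(window);
        System.out.println("elements left in window: " + window.size());
    }
}
```

Because the unmatched T events are never removed, they remain in the GlobalWindows window and are visible the next time the trigger fires.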

Update: Clearing State for Stale Keys / Dead Sensors

With solution 1, you can use state TTL to arrange for any inactive state associated with dead keys to be purged. Or you could use a KeyedProcessFunction rather than a RichFlatMapFunction, and use timers to accomplish the same thing.

Managing state for stale keys with the window API can be less straightforward, but for solution 2 I believe you can extend your custom trigger to include a timeout that will PURGE the window. And if you have used global state in the ProcessWindowFunction, you will need to rely on state TTL to clean that up.
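The effect of either approach can be modeled in a few lines of plain Java: remember when each key was last active, and drop keys that have been idle for longer than a timeout. This is a conceptual sketch only. In Flink the same outcome would come from configuring state TTL (via StateTtlConfig) or from registering a timer in a KeyedProcessFunction and clearing the state in `onTimer`. The class name `IdleKeyCleanupSketch`, its methods, and the use of string sensor IDs as keys are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Flink-free model of purging state for sensors that have stopped sending events. */
final class IdleKeyCleanupSketch {
    private final long timeoutMillis;
    // Last time (in milliseconds) each sensor key produced an event.
    private final Map<String, Long> lastSeen = new HashMap<>();

    IdleKeyCleanupSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called for every event; refreshes the key's activity time. */
    void onEvent(String sensorId, long nowMillis) {
        lastSeen.put(sensorId, nowMillis);
    }

    /** Analogous to a timer firing: drops every key idle for at least the timeout. */
    int purgeIdle(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() >= timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean isTracked(String sensorId) {
        return lastSeen.containsKey(sensorId);
    }

    public static void main(String[] args) {
        IdleKeyCleanupSketch cleanup = new IdleKeyCleanupSketch(1_000);
        cleanup.onEvent("sensor-a", 0);
        cleanup.onEvent("sensor-b", 900);
        System.out.println("purged keys: " + cleanup.purgeIdle(1_200));
        System.out.println("sensor-b still tracked: " + cleanup.isTracked("sensor-b"));
    }
}
```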
