
Detect end of event time session window (Apache Flink Java)

Assuming all events arrive on time and no lateness is allowed, how do I do some processing only when the session window has ended, i.e. once the watermark has passed (lastEventInWindowTimestamp + inactivityGap)? I couldn't find any API method that is called when this happens. Can I implement this logic using a custom ProcessWindowFunction?

Yes, a ProcessWindowFunction serves exactly this purpose. Such a function is called when the window is complete, and is passed (among other things) an Iterable containing the stream elements that have been assigned to the window. In the case of a session window, the ProcessWindowFunction isn't called until the period of inactivity has passed (in event time, that is once the watermark has moved past the end of the session).

Update: How can you report both the start and end timestamps for each session window?

I will assume that you can extract the timestamp for each event from the event itself. Then, if you are using a ProcessWindowFunction, you can iterate over the events in the window and determine the min and max timestamps for the events in the session; these will be the start and end timestamps.
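The min/max scan is easy to sketch in plain Java. The snippet below is only an illustration of the logic that could sit inside the body of a ProcessWindowFunction, which receives the window contents as an Iterable. The `Event` and `SessionBounds` types and their fields are hypothetical names made up for this example, not part of Flink.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical event type; only the timestamp matters for this sketch.
class Event {
    final String key;
    final long timestamp; // event time in epoch milliseconds

    Event(String key, long timestamp) {
        this.key = key;
        this.timestamp = timestamp;
    }
}

// Hypothetical result type: timestamps of the first and last event in a session.
class SessionBounds {
    final long start;
    final long end;

    SessionBounds(long start, long end) {
        this.start = start;
        this.end = end;
    }

    // Single pass over the window contents. Assumes the Iterable holds at
    // least one event, which is the case for a window that has fired.
    static SessionBounds of(Iterable<Event> events) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (Event e : events) {
            min = Math.min(min, e.timestamp);
            max = Math.max(max, e.timestamp);
        }
        return new SessionBounds(min, max);
    }
}

class SessionBoundsDemo {
    public static void main(String[] args) {
        // Events are not required to be ordered by timestamp.
        List<Event> session = Arrays.asList(
                new Event("user-1", 3000L),
                new Event("user-1", 1000L),
                new Event("user-1", 4500L));

        SessionBounds bounds = SessionBounds.of(session);
        System.out.println("session from " + bounds.start + " to " + bounds.end);
        // prints "session from 1000 to 4500"
    }
}
```

Because this approach needs every event of the session at once, the window has to buffer all of its elements until it fires.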

If, on the other hand, you would rather use a reduce function that incrementally computes the window results, you can work with tuples that track the (min, max) timestamps for each window.
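The incremental variant can be sketched the same way. Each event is first mapped to a pair (timestamp, timestamp), and pairs are then merged two at a time, which is the shape of computation a reduce function performs. The `MinMax` class below is a hypothetical stand-in for a tuple type, and a plain `BinaryOperator` stands in for the reduce function, so treat this as a model of the idea rather than Flink code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical pair type standing in for a (min, max) tuple.
class MinMax {
    final long min;
    final long max;

    MinMax(long min, long max) {
        this.min = min;
        this.max = max;
    }

    // A single event starts out as the pair (timestamp, timestamp).
    static MinMax ofTimestamp(long timestamp) {
        return new MinMax(timestamp, timestamp);
    }

    // Merges two partial results. The operation is associative and
    // commutative, so the order in which events arrive does not matter.
    static final BinaryOperator<MinMax> MERGE =
            (a, b) -> new MinMax(Math.min(a.min, b.min), Math.max(a.max, b.max));
}

class MinMaxDemo {
    public static void main(String[] args) {
        List<Long> timestamps = Arrays.asList(3000L, 1000L, 4500L);

        // Fold the per-event pairs into one pair for the whole session.
        MinMax result = timestamps.stream()
                .map(MinMax::ofTimestamp)
                .reduce(MinMax.MERGE)
                .get(); // safe here because the list is non-empty

        System.out.println("session from " + result.min + " to " + result.max);
        // prints "session from 1000 to 4500"
    }
}
```

The trade-off against the iterate-at-the-end approach is state size: only one pair per session has to be kept, instead of every event in the window.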
