Trigger elements exactly once using Fixed Windowing with Apache Beam
I'm reading data from Google Pub/Sub and windowing it into fixed windows of 5 minutes, but the data is not triggered correctly. I've tried multiple combinations and nothing seems to work. This looks fairly simple, but I'm unable to get it right.
Use case:
Attempt(s):
1. Using AfterWatermark.pastEndOfWindow to trigger. This doesn't produce any output at all. About 1000 messages were read from the subscription, but the window emitted none of them.
    Window.<EventModel>into(
            FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(AfterWatermark.pastEndOfWindow())
        .withAllowedLateness(Duration.standardDays(1), Window.ClosingBehavior.FIRE_ALWAYS)
        .discardingFiredPanes();
2. Using global windowing: this works correctly, but it uses GlobalWindows and I need fixed windowing.
    // Type argument aligned with the declared Window<EventModel> type.
    Window<EventModel> window = Window.<EventModel>into(new GlobalWindows())
        .triggering(
            Repeatedly.forever(
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(5))))
        .discardingFiredPanes()
        .withAllowedLateness(Duration.standardDays(1));
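For context on what fixed windowing in attempt 1 does differently from the global window in attempt 2, here is a minimal plain-Java sketch of how an event timestamp maps to a 5-minute fixed window. It is not Beam code: the class and method names are made up for illustration, and it assumes windows aligned to the epoch with no offset, which is my understanding of the default for FixedWindows.of(...).

```java
import java.time.Duration;
import java.time.Instant;

// Plain-Java illustration only; these names are hypothetical, not Beam API.
final class FixedWindowSketch {

    // Start (inclusive) of the fixed window containing `timestamp`,
    // assuming windows are aligned to the epoch with no offset.
    static Instant windowStart(Instant timestamp, Duration size) {
        long sizeMillis = size.toMillis();
        long ts = timestamp.toEpochMilli();
        // floorMod keeps the result correct for timestamps before the epoch.
        return Instant.ofEpochMilli(ts - Math.floorMod(ts, sizeMillis));
    }

    // End (exclusive) of that window.
    static Instant windowEnd(Instant timestamp, Duration size) {
        return windowStart(timestamp, size).plus(size);
    }

    public static void main(String[] args) {
        Duration size = Duration.ofMinutes(5);
        Instant eventTime = Instant.parse("2021-01-01T10:07:30Z");
        System.out.println(windowStart(eventTime, size)); // 2021-01-01T10:05:00Z
        System.out.println(windowEnd(eventTime, size));   // 2021-01-01T10:10:00Z
    }
}
```

The point of the sketch is that a fixed window is defined by event time, so a watermark-based trigger can only fire once the runner believes event time has passed the window's end. The processing-time trigger in attempt 2 does not depend on that.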
I've also tried other combinations that use early or late firings. They trigger some elements but don't fit my use case: I don't need early or late firings, just results once every 5 minutes.
Any input would be really helpful; I've invested way too much time in this with no luck.
Found the issue:
It was a bug with the DirectRunner. For some reason the DirectRunner was not advancing the watermark, and hence nothing was triggered.
The code below worked correctly with the Dataflow Runner: elements were triggered after the end of the window.
    Window<MyModel> window = Window.<MyModel>into(FixedWindows.of(Duration.standardMinutes(10)))
        .triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()))
        .withAllowedLateness(Duration.standardDays(1))
        .discardingFiredPanes();
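To illustrate why a stalled watermark yields no output at all, here is a rough plain-Java simulation of a single window with an end-of-window watermark trigger. It is only a sketch: the class and its methods are hypothetical and not part of Beam, and the firing condition (watermark at or past the window end) is a simplification of what a real runner does.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation for illustration; none of these names are Beam API.
final class WatermarkFiringSketch {
    private final Instant windowEnd;
    private final List<String> buffered = new ArrayList<>();
    private boolean fired = false;

    WatermarkFiringSketch(Instant windowEnd) {
        this.windowEnd = windowEnd;
    }

    // Elements are only buffered; adding one never causes output by itself.
    void add(String element) {
        buffered.add(element);
    }

    // Called when the (simulated) runner reports a new watermark.
    // Returns the buffered elements the first time the watermark reaches
    // the window end, and an empty list on every other call.
    List<String> onWatermark(Instant watermark) {
        if (fired || watermark.isBefore(windowEnd)) {
            return new ArrayList<>();
        }
        fired = true;
        List<String> pane = new ArrayList<>(buffered);
        buffered.clear();
        return pane;
    }

    public static void main(String[] args) {
        WatermarkFiringSketch window =
            new WatermarkFiringSketch(Instant.parse("2021-01-01T10:05:00Z"));
        window.add("a");
        window.add("b");

        // Watermark still before the window end: nothing is emitted.
        System.out.println(window.onWatermark(Instant.parse("2021-01-01T10:03:00Z")));
        // Watermark reaches the window end: the pane fires once.
        System.out.println(window.onWatermark(Instant.parse("2021-01-01T10:05:00Z")));
        // Later watermark updates do not fire the same pane again.
        System.out.println(window.onWatermark(Instant.parse("2021-01-01T10:06:00Z")));
    }
}
```

If onWatermark is never called with a late enough timestamp, the buffered elements are never emitted. That matches the symptom described above, where about 1000 messages were read but nothing came out of the window.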