
Trigger elements exactly once using Fixed Windowing with Apache Beam

I'm reading data from Google Pub/Sub and windowing it into fixed windows of 5 minutes, but the data is not triggered correctly. I've tried multiple combinations and nothing seems to work. This looks fairly simple, but I'm unable to get it right.

Use case:

  1. Read data from Pub/Sub.
  2. Window the data into 5-minute fixed windows.
  3. Perform aggregations after the end of each 5-minute window.
  4. Use an allowed lateness period of 1 day.
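To make step 2 concrete, here is a minimal plain-Java sketch of how an event timestamp maps to a 5-minute fixed window. It does not use Beam: the class and method names are hypothetical, and it assumes windows aligned to the epoch with millisecond timestamps.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Apache Beam: illustrates fixed-window assignment only.
final class FixedWindowSketch {
    // Window size of 5 minutes, in milliseconds.
    static final long WINDOW_MILLIS = Duration.ofMinutes(5).toMillis();

    // Start (inclusive) of the fixed window that contains the timestamp.
    static long windowStart(long timestampMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, WINDOW_MILLIS);
    }

    // End (exclusive) of the fixed window that contains the timestamp.
    static long windowEnd(long timestampMillis) {
        return windowStart(timestampMillis) + WINDOW_MILLIS;
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2024-01-01T10:07:30Z").toEpochMilli();
        System.out.println(Instant.ofEpochMilli(windowStart(ts))
                + " .. " + Instant.ofEpochMilli(windowEnd(ts)));
    }
}
```

Every element whose event timestamp falls in the same 5-minute interval lands in the same window, and the aggregation in step 3 is then computed per window.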

Attempts:

1. Using AfterWatermark.pastEndOfWindow to trigger. This doesn't produce any output at all. About 1000 messages were read from the subscription, but the window did not output any of them.

Window.<EventModel>into(
                FixedWindows.of(Duration.standardMinutes(5)))
                .triggering(AfterWatermark.pastEndOfWindow())
                .withAllowedLateness(Duration.standardDays(1), Window.ClosingBehavior.FIRE_ALWAYS)
                .discardingFiredPanes();

2. Using global windowing. This works correctly, but it uses GlobalWindows and I need to implement fixed windowing.

Window<EventModel> window = Window.<EventModel>into(new GlobalWindows())
                .triggering(
                        Repeatedly.forever(
                                AfterProcessingTime.pastFirstElementInPane()
                                        .plusDelayOf(Duration.standardMinutes(5))))
                .discardingFiredPanes()
                .withAllowedLateness(Duration.standardDays(1));

I've also tried other combinations that use early or late firings. They trigger some elements but don't fit my use case: I don't need early or late firings, just results once every 5 minutes.

Any input would be really helpful; I've invested far too much time in this with no luck.

Found the issue:

It was a bug with the DirectRunner. For some reason the DirectRunner was not advancing the watermark, so nothing was triggered.

The code below worked correctly with the Dataflow runner: elements were triggered after the end of the window.

Window<MyModel> window = Window.<MyModel>into(FixedWindows.of(Duration.standardMinutes(10)))
                    .triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()))
                    .withAllowedLateness(Duration.standardDays(1))
                    .discardingFiredPanes();
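Why a stalled watermark produces no output can be illustrated with a toy model. This is a plain-Java sketch, not Beam's implementation: all names are hypothetical, and late data and allowed lateness are deliberately left out. Elements are buffered per window and emitted only once the watermark reaches the window's end, so if the watermark never advances, nothing is emitted no matter how many elements arrive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model of watermark-driven firing; not Beam's actual trigger machinery.
final class WatermarkSketch {
    static final long WINDOW_MILLIS = 5 * 60 * 1000L;

    // Buffered elements, keyed by the (exclusive) end of their fixed window.
    private final TreeMap<Long, List<String>> buffered = new TreeMap<>();
    private long watermark = Long.MIN_VALUE;

    // Assign the element to its fixed window and buffer it; nothing is emitted here.
    void add(long timestampMillis, String element) {
        long end = timestampMillis - Math.floorMod(timestampMillis, WINDOW_MILLIS) + WINDOW_MILLIS;
        buffered.computeIfAbsent(end, k -> new ArrayList<>()).add(element);
    }

    // Advance the watermark (it never moves backwards) and return one pane
    // per window whose end the watermark has reached.
    List<List<String>> advanceWatermark(long newWatermark) {
        watermark = Math.max(watermark, newWatermark);
        Map<Long, List<String>> ready = buffered.headMap(watermark, true);
        List<List<String>> fired = new ArrayList<>(ready.values());
        ready.clear(); // discard fired panes from the buffer
        return fired;
    }
}
```

In this model, calling add any number of times without calling advanceWatermark leaves every element buffered, which matches the symptom in attempt 1.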

