简体   繁体   中英

Trigger elements exactly once using Fixed Windowing with Apache Beam

I'm reading data from Google pub-sub and windowing them into fixed window of 5 minutes. But - the data is not triggered correctly. I've tried multiple combinations, nothing seems to work. This looks something fairly simple - but I'm unable to get it right.

Use case -

  1. Read data from pub-sub
  2. Window them into 5 minutes
  3. Perform aggregations after the end of the 5 minutes window.
  4. AllowedLateness period of 1 day.

Attempt(s):

1.Using AfterWatermark.pastEndOfWindow to trigger. This doesn't produce any output at all. There were about 1000 messages read from the subscription but no messages was outputted by the window.

Window.<EventModel>into(
                FixedWindows.of(Duration.standardMinutes(5)))
                .triggering(AfterWatermark.pastEndOfWindow())
                .withAllowedLateness(Duration.standardDays(1), Window.ClosingBehavior.FIRE_ALWAYS)
                .discardingFiredPanes();

2.Using Global windowing: This works correctly. But this uses GlobalWindows - but I need to implement Fixed Windowing.

Window<EventModel> window = Window.<OrderEvent>
                into(new GlobalWindows())
                .triggering(
                        Repeatedly.forever( 
              AfterProcessingTime.pastFirstElementInPane().plusDelayOf(Duration.standardMinutes(5))))
                .discardingFiredPanes()
                .withAllowedLateness(Duration.standardDays(1));

I've attempted other combinations which use - Early or Late Firings - which trigger some elements but not fit my use case - I don't need early or late firings - just need results once every 5 minutes.

Any input would be really helpful, I've invested way too much time in this with no luck.

Found the issue:

It was bug with DirectRunner. For some reason - direct runner was not advancing the watermark and hence nothing was triggered.

The below code worked correctly - with Dataflow Runner - elements were triggered after the end of the window.

Window<MyModel> window = Window.<MyModel>into(FixedWindows.of(Duration.standardMinutes(10)))
                    .triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()))
                    .withAllowedLateness(Duration.standardDays(1))
                    .discardingFiredPanes();

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM