Flink counter with timestamp

I was reading the Flink example CountWithTimestamp, and below is a code snippet from the example:

    @Override
    public void processElement(Tuple2<String, String> value, Context ctx, Collector<Tuple2<String, Long>> out)
            throws Exception {

        // retrieve the current count
        CountWithTimestamp current = state.value();
        if (current == null) {
            current = new CountWithTimestamp();
            current.key = value.f0;
        }

        // update the state's count
        current.count++;

        // set the state's timestamp to the record's assigned event time timestamp
        current.lastModified = ctx.timestamp();

        // write the state back
        state.update(current);

        // schedule the next timer 60 seconds from the current event time
        ctx.timerService().registerEventTimeTimer(current.lastModified + 60000);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Tuple2<String, Long>> out)
            throws Exception {

        // get the state for the key that scheduled the timer
        CountWithTimestamp result = state.value();

        // check if this is an outdated timer or the latest timer
        if (timestamp == result.lastModified + 60000) {
            // emit the state on timeout
            out.collect(new Tuple2<String, Long>(result.key, result.count));
        }
    }
}

My question: if I remove the if statement timestamp == result.lastModified + 60000 in onTimer (leaving the collect statement untouched), and instead add another if statement, if (ctx.timestamp() < current.lastModified + 60000) { ctx.timerService().deleteEventTimeTimer(current.lastModified + 60000); }, at the beginning of processElement, would the semantics of the program be the same? And if the semantics are the same, is there any reason to prefer one version over the other?

You are correct to think that the implementation that deletes the timer has the same semantics. In fact, I recently changed the example used in our training materials to do just that, as I prefer this approach. I find it preferable because all of the complex business logic is then in one place (in processElement), and whenever onTimer is called, you know exactly what to do, no questions asked. Plus, it's more performant, as there are fewer timers to checkpoint and eventually trigger.
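To make the comparison concrete, here is a minimal plain-Java sketch that models a single key's state and its event-time timers and replays the same input through both variants. It is a simulation, not Flink code: the class TimerSemanticsSketch, its methods, and the replayed timestamps are all hypothetical, and the watermark is advanced by hand. In a real KeyedProcessFunction the corresponding calls would be ctx.timerService().registerEventTimeTimer(...) and ctx.timerService().deleteEventTimeTimer(...). The sketch also assumes the previous timer is deleted whenever state already exists, without the extra timestamp comparison from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Plain-Java model of one key's state and event-time timers; not Flink's API. */
final class TimerSemanticsSketch {
    static final long TIMEOUT_MS = 60_000L;

    // true: delete the stale timer in processElement; false: filter in onTimer
    private final boolean deleteStaleTimer;
    // A set, because timers are deduplicated per key and timestamp
    private final TreeSet<Long> timers = new TreeSet<>();
    private final List<Long> emitted = new ArrayList<>();
    private boolean hasState = false;
    private long count = 0;
    private long lastModified = 0;
    private int timersFired = 0;

    TimerSemanticsSketch(boolean deleteStaleTimer) {
        this.deleteStaleTimer = deleteStaleTimer;
    }

    void processElement(long timestamp) {
        if (deleteStaleTimer && hasState) {
            // Variant B: drop the timer registered for the previous element
            timers.remove(lastModified + TIMEOUT_MS);
        }
        hasState = true;
        count++;
        lastModified = timestamp;
        timers.add(timestamp + TIMEOUT_MS);
    }

    /** Fires every registered timer whose timestamp is at or below the watermark. */
    void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.first() <= watermark) {
            onTimer(timers.pollFirst());
        }
    }

    private void onTimer(long timestamp) {
        timersFired++;
        // Variant A must check for outdated timers; variant B emits unconditionally
        if (deleteStaleTimer || timestamp == lastModified + TIMEOUT_MS) {
            emitted.add(count);
        }
    }

    List<Long> emitted() { return emitted; }

    int timersFired() { return timersFired; }

    /** Replays one fixed, made-up sequence of elements and watermarks. */
    static TimerSemanticsSketch replay(boolean deleteStaleTimer) {
        TimerSemanticsSketch sketch = new TimerSemanticsSketch(deleteStaleTimer);
        sketch.processElement(1_000);
        sketch.processElement(5_000);
        sketch.advanceWatermark(61_000);
        sketch.processElement(62_000);
        sketch.advanceWatermark(200_000);
        return sketch;
    }

    public static void main(String[] args) {
        TimerSemanticsSketch filter = replay(false);
        TimerSemanticsSketch delete = replay(true);
        System.out.println("filter variant emitted " + filter.emitted()
                + " after " + filter.timersFired() + " timer firings");
        System.out.println("delete variant emitted " + delete.emitted()
                + " after " + delete.timersFired() + " timer firings");
    }
}
```

For this input, both variants emit the same counts, but the delete variant fires fewer timers, which is the performance point made above.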

This example was written for the docs back before timers could be deleted, and hasn't been updated.

You can find the reworked example I mentioned in these slides, once you get past the registration page: https://training.ververica.com/decks/process-function/

FWIW, I also recently reworked the reference solution to the corresponding training exercise along the same lines: https://github.com/apache/flink-training/tree/master/long-ride-alerts
