
Kafka Flink timestamp, event time and watermark

I am reading the book Stream Processing with Apache Flink, which states: “As of version 0.10.0, Kafka supports message timestamps. When reading from Kafka version 0.10 or later, the consumer will automatically extract the message timestamp as an event-time timestamp if the application runs in event-time mode.” So inside a processElement function, will the call context.timestamp() return the Kafka message timestamp by default? Could you please provide a simple example of how to implement an AssignerWithPeriodicWatermarks/AssignerWithPunctuatedWatermarks that extracts timestamps (and builds watermarks) based on the consumed Kafka message timestamp?

If I am using TimeCharacteristic.ProcessingTime, would ctx.timestamp() return the processing time, and in that case would it be equivalent to context.timerService().currentProcessingTime()?

Thank you.

The Flink Kafka consumer takes care of this for you, and puts the timestamp where it needs to be. In Flink 1.11 you can simply rely on this, though you still need to provide a WatermarkStrategy that specifies the out-of-orderness (or asserts that the timestamps are in order):

FlinkKafkaConsumer<String> myConsumer = new FlinkKafkaConsumer<>(...);
// Timestamps come from the Kafka records; the strategy only sets the out-of-orderness bound.
myConsumer.assignTimestampsAndWatermarks(
    WatermarkStrategy
        .forBoundedOutOfOrderness(Duration.ofSeconds(20)));
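To make the out-of-orderness bound more concrete, here is a minimal standalone sketch in plain Java of how such a watermark trails the largest timestamp seen so far. It does not use the Flink API: the class and method names are hypothetical, and the formula (max timestamp minus the bound minus 1 ms) is my understanding of what the built-in bounded-out-of-orderness generator does, so treat it as an illustration rather than the actual implementation.

```java
import java.time.Duration;

/** Standalone model (NOT the Flink API): watermark = maxTimestamp - bound - 1. */
final class BoundedOutOfOrdernessModel {
    private final long boundMillis;
    private long maxTimestamp;

    BoundedOutOfOrdernessModel(Duration bound) {
        this.boundMillis = bound.toMillis();
        // Start low enough that the first watermark computation does not underflow.
        this.maxTimestamp = Long.MIN_VALUE + boundMillis + 1;
    }

    /** Call once per record with its timestamp (e.g. the Kafka message timestamp) in epoch millis. */
    void onEvent(long timestampMillis) {
        maxTimestamp = Math.max(maxTimestamp, timestampMillis);
    }

    /** The watermark that would be emitted on the next periodic tick. */
    long currentWatermark() {
        return maxTimestamp - boundMillis - 1;
    }

    /** In this model a record counts as late once its timestamp is at or below the watermark. */
    boolean isLate(long timestampMillis) {
        return timestampMillis <= currentWatermark();
    }

    public static void main(String[] args) {
        BoundedOutOfOrdernessModel wm = new BoundedOutOfOrdernessModel(Duration.ofSeconds(20));
        wm.onEvent(100_000L);
        wm.onEvent(90_000L); // out of order: does not move the watermark
        System.out.println(wm.currentWatermark()); // prints 79999
        System.out.println(wm.isLate(85_000L));    // prints false
        System.out.println(wm.isLate(70_000L));    // prints true
    }
}
```

With a 20-second bound, a record may arrive up to 20 seconds behind the newest timestamp seen and still be ahead of the watermark.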

In earlier versions of Flink you had to provide an implementation of a timestamp assigner, which would look like this:

@Override
public long extractTimestamp(Long element, long previousElementTimestamp) {
    // Keep the timestamp already attached to the record by the Kafka consumer.
    return previousElementTimestamp;
}

This version of the extractTimestamp method is passed the current value of the timestamp present in the StreamRecord as previousElementTimestamp, which in this case will be the timestamp put there by the Flink Kafka consumer.
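As a rough illustration of that pass-through idea, the following standalone sketch mirrors the shape of the old extractTimestamp callback in plain Java. The TimestampExtractor interface here is a hypothetical stand-in, not the Flink interface, and the checked variant assumes that a negative previousElementTimestamp means no timestamp was attached, which you should verify against the Flink version you run.

```java
/** Standalone sketch (NOT the Flink API): mirrors the pre-1.11 extractTimestamp signature. */
final class PassThroughTimestampSketch {

    /** Hypothetical stand-in for the old timestamp-assigner callback. */
    interface TimestampExtractor<T> {
        long extractTimestamp(T element, long previousElementTimestamp);
    }

    /** Keeps whatever timestamp the source already attached to the record. */
    static final TimestampExtractor<String> PASS_THROUGH =
            (element, previousElementTimestamp) -> previousElementTimestamp;

    /**
     * Variant that fails fast when no usable timestamp was attached.
     * Assumption: a negative value signals "no timestamp assigned yet".
     */
    static final TimestampExtractor<String> PASS_THROUGH_CHECKED =
            (element, previousElementTimestamp) -> {
                if (previousElementTimestamp < 0) {
                    throw new IllegalStateException("record has no timestamp: " + element);
                }
                return previousElementTimestamp;
            };
}
```

The checked variant can be handy while debugging, since a missing timestamp then shows up as an exception instead of silently producing a nonsensical event time.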

Flink 1.11 docs
Flink 1.10 docs

As for what is returned by ctx.timestamp() when using TimeCharacteristic.ProcessingTime, this method returns null in that case. (Semantically, yes, it is as though the timestamp is the current processing time, but that's not how it's implemented.)
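Because ctx.timestamp() returns a boxed Long, assigning a null result to a primitive long would throw a NullPointerException. A minimal sketch of a null-safe fallback, in plain Java with a hypothetical helper name (the two parameters stand in for ctx.timestamp() and context.timerService().currentProcessingTime()):

```java
/** Standalone sketch (NOT the Flink API): null-safe handling of a record timestamp. */
final class TimestampFallbackSketch {

    /**
     * @param recordTimestamp models ctx.timestamp(); null when the record carries no timestamp
     * @param currentProcessingTime models timerService().currentProcessingTime()
     * @return the record timestamp if present, otherwise the current processing time
     */
    static long effectiveTimestamp(Long recordTimestamp, long currentProcessingTime) {
        return recordTimestamp != null ? recordTimestamp : currentProcessingTime;
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimestamp(null, 5_000L));   // prints 5000
        System.out.println(effectiveTimestamp(1_234L, 5_000L)); // prints 1234
    }
}
```

Whether falling back to processing time is appropriate depends on the job; the point is only that the null case has to be handled explicitly.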
