Flink watermark is negative
I was trying to assign timestamps and watermarks to a stream by implementing AssignerWithPeriodicWatermarks. Inside the class, it implements:
override def getCurrentWatermark: Watermark = {
  // this guarantees that the watermark never goes backwards.
  val potentialWM = currentMaxTimestamp - maxOutOfOrderness
  if (potentialWM >= lastEmittedWatermark) lastEmittedWatermark = potentialWM
  new Watermark(lastEmittedWatermark)
}

override def extractTimestamp(element: T, previousElementTimestamp: Long): Long = {
  val timestamp = element.streamTime // something exists in the stream
  if (timestamp > currentMaxTimestamp) currentMaxTimestamp = timestamp
  timestamp
}
However, I still got watermarks with the default value -9223372036854775808, and when I added print statements inside both functions, only the println inside extractTimestamp was printed, which means that getCurrentWatermark was never called.
The implementation seems to be right, because the same code was able to run in another script (some code not written by me).
PS: This is not the first time I have encountered a negative watermark. What I found is that after a certain period of time the watermark becomes positive, but I am still quite confused about what happens at the beginning.
The issue is that you are using AssignerWithPeriodicWatermarks, which generates watermarks at intervals rather than per event. Whenever you use AssignerWithPeriodicWatermarks, you should call setAutoWatermarkInterval on the execution environment's config. The value you provide there is the interval at which getCurrentWatermark will be called. If you haven't set it, the method will never be called, so the watermark will never change. For testing and learning, you can consider using AssignerWithPunctuatedWatermarks, as this will simply emit a watermark for each event.
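As a rough illustration of where the interval is configured, here is a minimal sketch assuming a pre-1.11 Flink Scala API (the one that still ships AssignerWithPeriodicWatermarks). The object name and the 1000 ms value are arbitrary choices for the example, not something from the question. The actual setter lives on the ExecutionConfig returned by env.getConfig.

```scala
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

// Hypothetical object name, used only for this sketch.
object WatermarkIntervalExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Event time must be enabled for watermarks to drive event-time operators
    // (assumption: a Flink version where this call is still required).
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // How often, in milliseconds, getCurrentWatermark is invoked on a
    // periodic assigner. 1000 is an arbitrary illustrative value.
    env.getConfig.setAutoWatermarkInterval(1000L)

    // Read the setting back to confirm what the job will use.
    println(env.getConfig.getAutoWatermarkInterval)
  }
}
```

Setting the interval after the time characteristic matters in older versions, because switching the time characteristic may itself reset the interval.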
EDIT: As was mentioned below this answer, the default value for the auto-watermark interval is actually 200 ms. Also, using AssignerWithPunctuatedWatermarks doesn't mean you need to emit a watermark for each event; rather, the method that emits them is called for each event. If you don't want to emit a watermark, the method should simply return null.
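To make the "return null" behaviour concrete, here is a small sketch of a punctuated assigner. The Event case class, its isMarker flag, and the class name MarkerWatermarkAssigner are hypothetical and exist only for this illustration; the question's real element type is not shown. It assumes a Flink version that still provides AssignerWithPunctuatedWatermarks.

```scala
import org.apache.flink.streaming.api.functions.AssignerWithPunctuatedWatermarks
import org.apache.flink.streaming.api.watermark.Watermark

// Hypothetical element type: a timestamp plus a flag marking "watermark-worthy" events.
case class Event(streamTime: Long, isMarker: Boolean)

// Hypothetical assigner: called for every element, but emits a watermark
// only when the element carries the marker flag.
class MarkerWatermarkAssigner extends AssignerWithPunctuatedWatermarks[Event] {

  // Use the timestamp carried by the element as its event time.
  override def extractTimestamp(element: Event, previousElementTimestamp: Long): Long =
    element.streamTime

  // Invoked once per element, right after extractTimestamp.
  // Returning null tells Flink that no new watermark should be emitted.
  override def checkAndGetNextWatermark(lastElement: Event, extractedTimestamp: Long): Watermark =
    if (lastElement.isMarker) new Watermark(extractedTimestamp) else null
}
```

It would be attached with something like stream.assignTimestampsAndWatermarks(new MarkerWatermarkAssigner), where stream is a DataStream[Event].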