
Flink watermark is negative

I was trying to assign timestamps and watermarks to a stream by implementing AssignerWithPeriodicWatermarks. Inside the class, it implements:

  override def getCurrentWatermark: Watermark = {
    // this guarantees that the watermark never goes backwards.
    val potentialWM = currentMaxTimestamp - maxOutOfOrderness
    if (potentialWM >= lastEmittedWatermark) lastEmittedWatermark = potentialWM

    new Watermark(lastEmittedWatermark)
  }

  override def extractTimestamp(element: T, previousElementTimestamp: Long): Long = {
    val timestamp = element.streamTime // something exists in the stream
    if (timestamp > currentMaxTimestamp) currentMaxTimestamp = timestamp
    timestamp
  }

However, I still got watermarks with the default value -9223372036854775808, and when I added print statements inside both functions, only the println inside extractTimestamp was printed, which suggests that getCurrentWatermark was never called.

The implementation seems to be right, because the same code was able to run in another script (some code not written by me).

PS: This is not the first time I have encountered a negative watermark. What I found is that after a certain period of time the watermark goes positive, but I am still quite confused about what happens at the beginning.

The issue is that you are using AssignerWithPeriodicWatermarks, which does not generate watermarks per event but at intervals. Whenever you use AssignerWithPeriodicWatermarks, you should call setAutoWatermarkInterval on the execution environment's config (env.getConfig). The value you provide there is the interval at which getCurrentWatermark will be called. If you haven't set it, the method will never be called, so the watermark will never change. For testing and learning, you can consider using AssignerWithPunctuatedWatermarks, as this will simply emit a watermark for each event.
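The interval setting described above can be sketched roughly as follows. This assumes the Scala DataStream API of a Flink release that still ships AssignerWithPeriodicWatermarks; the object name and the 200 ms value are illustrative, not taken from the question.

```scala
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

// Hypothetical job skeleton; only the two configuration calls matter here.
object WatermarkIntervalSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Assumption: the job is meant to run on event time.
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // Interval in milliseconds at which getCurrentWatermark is polled.
    // Set it after choosing the time characteristic, since that call
    // may reset the interval in older releases.
    env.getConfig.setAutoWatermarkInterval(200L)

    // Print the effective interval to confirm the setting took hold.
    println(env.getConfig.getAutoWatermarkInterval)
  }
}
```

Until the first periodic poll fires and an element has been seen, downstream operators still report the initial watermark, which is Long.MinValue (-9223372036854775808).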

EDIT: As mentioned below this answer, the default value for the auto-watermark interval is actually 200 ms. Also, using AssignerWithPunctuatedWatermarks doesn't mean you need to emit a watermark for each event; rather, the method for emitting them is called for each event. If you don't want to emit a watermark, the method should simply return null.
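As a minimal sketch of the punctuated variant, the assigner below emits a watermark only for some elements and returns null for the rest. The Event case class, its isMarker flag, and the class name MarkerWatermarkAssigner are hypothetical and exist only for this example; the question's own element type is not shown.

```scala
import org.apache.flink.streaming.api.functions.AssignerWithPunctuatedWatermarks
import org.apache.flink.streaming.api.watermark.Watermark

// Hypothetical event type: streamTime mirrors the field used in the question,
// isMarker is an invented flag that decides when a watermark is emitted.
case class Event(streamTime: Long, isMarker: Boolean)

class MarkerWatermarkAssigner extends AssignerWithPunctuatedWatermarks[Event] {

  // Called once per element to assign its event-time timestamp.
  override def extractTimestamp(element: Event, previousElementTimestamp: Long): Long =
    element.streamTime

  // Called once per element, right after extractTimestamp.
  // Returning null means "no watermark for this element".
  override def checkAndGetNextWatermark(lastElement: Event, extractedTimestamp: Long): Watermark =
    if (lastElement.isMarker) new Watermark(extractedTimestamp) else null
}
```

It would be attached with stream.assignTimestampsAndWatermarks(new MarkerWatermarkAssigner), the same way as the periodic assigner.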

