
Clear Flink watermark state in DataStream

Is it possible to clear the current watermark in a DataStream?

Example input for a month-long watermark with no allowed lateness:

[
  { timestamp: '10/2018' },
  { timestamp: '11/2018' },
  { timestamp: '11/2018', clearState: true },
  { timestamp: '9/2018' }
]

Normally, the '9/2018' record would be thrown out as it is late. Is there a way to programmatically reset the watermark state when the clearState message is seen?

Watermarks are not supposed to go backwards -- it's undefined what will happen, and in practice it's a bad idea. There are, however, various ways to accommodate late data.

If you are using the window API, Flink will clear any window state once the allowed lateness has expired for a window. If you want more control than this, consider using a ProcessFunction, which both allows and requires you to manage state (and timers) explicitly.
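As a rough illustration of the "manage state explicitly" idea, here is a plain-Java sketch of the bookkeeping you might put inside a ProcessFunction. It does not use any Flink classes: the class name ManualLatenessFilter and the accept method are hypothetical, and in a real job the highestSeen field would presumably be kept in keyed state (for example a ValueState inside a KeyedProcessFunction). It also does not touch Flink's actual watermark, so event-time windows and timers would still follow the real watermark; only your own lateness check is reset.

```java
import java.time.YearMonth;

// Hypothetical helper, not part of Flink: tracks lateness in application
// state instead of relying on the watermark, so it can be reset on demand.
class ManualLatenessFilter {
    // Highest month seen so far; null means "no state yet" or "state cleared".
    private YearMonth highestSeen = null;

    /** Returns true if the record is accepted, false if it is dropped as late. */
    boolean accept(YearMonth timestamp, boolean clearState) {
        boolean late = highestSeen != null && timestamp.isBefore(highestSeen);
        if (clearState) {
            // Forget everything seen so far; the next record starts fresh.
            highestSeen = null;
            return !late;
        }
        if (late) {
            return false; // older than what we have already seen
        }
        highestSeen = timestamp;
        return true;
    }

    public static void main(String[] args) {
        ManualLatenessFilter filter = new ManualLatenessFilter();
        // Same sequence as the example input in the question.
        System.out.println(filter.accept(YearMonth.of(2018, 10), false));
        System.out.println(filter.accept(YearMonth.of(2018, 11), false));
        System.out.println(filter.accept(YearMonth.of(2018, 11), true));
        System.out.println(filter.accept(YearMonth.of(2018, 9), false));
    }
}
```

With the clearState record in the sequence, the '9/2018' record is accepted because the filter's own state was reset just before it arrived. Without that record it would be dropped, since it is older than '11/2018'.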
