
Flink sliding window does not work as expected

I have a data stream 1, 2, 3, 4, 5, 6, ...

I am applying a sliding countWindow as below:

inputStream.keyBy("id").countWindow(2, 1);

Expected output:

1,2

2,3

3,4 ..

Actual output:

1

1,2

2,3

3,4

Why does it slide first, before accumulating the full window size?

First of all, the expected output you provided is wrong. You specified a window size of 2 minutes, so the output (assuming it is the start and end of each window) should be:

1:00:00, 1:01:00
1:01:00, 1:02:00

The first event, with timestamp 1:00:00, should be assigned to the windows (0:59:00, 1:01:00) and (1:00:00, 1:02:00). I believe that answers your question.
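To make the assignment rule concrete, here is a minimal sketch in plain Scala (no Flink dependency). It assumes windows are aligned to multiples of the slide with no offset; `SlidingAssignSketch` and `windowsFor` are hypothetical names for illustration, not part of the Flink API.

```scala
object SlidingAssignSketch {
  // Hypothetical helper, not a Flink API: returns the (start, end) bounds in
  // milliseconds of every sliding window that contains `timestamp`, assuming
  // window starts are aligned to multiples of `slide` (no offset).
  def windowsFor(timestamp: Long, size: Long, slide: Long): List[(Long, Long)] = {
    require(size > 0 && slide > 0, "size and slide must be positive")
    // Latest window start that is not after the timestamp.
    val lastStart = timestamp - Math.floorMod(timestamp, slide)
    // Walk backwards one slide at a time while the window still covers the timestamp.
    Iterator.iterate(lastStart)(_ - slide)
      .takeWhile(_ > timestamp - size)
      .map(start => (start, start + size))
      .toList
  }

  def main(args: Array[String]): Unit = {
    val minute = 60000L
    // 1:00:00 expressed as milliseconds since midnight, size 2 min, slide 1 min.
    windowsFor(60 * minute, 2 * minute, minute).foreach(println)
  }
}
```

With a 2-minute size and 1-minute slide, the event at 1:00:00 falls into two windows: the one starting at 1:00:00 and the one starting at 0:59:00.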

After edit:

For countWindow the same rule applies: the first element belongs to two windows. It is easier to reason about with countWindow(4, 2). Have a look at a basic example:

import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

val sEnv = StreamExecutionEnvironment.getExecutionEnvironment
sEnv.setParallelism(1)

sEnv.fromCollection((1 to 10)).countWindowAll(4, 2).apply(
  (window, numbers, collector: Collector[Seq[Int]]) =>
    collector.collect(numbers.toSeq)
).print()

sEnv.execute()

The output is:

List(1, 2)
List(1, 2, 3, 4)
List(3, 4, 5, 6)
List(5, 6, 7, 8)
List(7, 8, 9, 10)

Note that the first window the first element belongs to starts in the past.
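The same behaviour can be reproduced without Flink. The sketch below assumes the semantics observed in the output above: a window fires after every `slide` elements and contains at most the last `size` elements seen so far. `CountWindowSketch` and `slidingCountWindows` are hypothetical names, not Flink APIs.

```scala
object CountWindowSketch {
  // Hypothetical helper, not a Flink API: emits one window per `slide` elements,
  // each holding at most the last `size` elements seen so far.
  def slidingCountWindows[A](input: Seq[A], size: Int, slide: Int): Seq[Seq[A]] = {
    require(size > 0 && slide > 0, "size and slide must be positive")
    // `seen` is the number of elements that have arrived when the window fires.
    (slide to input.length by slide).map { seen =>
      input.slice(math.max(0, seen - size), seen)
    }
  }

  def main(args: Array[String]): Unit = {
    // Mirrors countWindowAll(4, 2) on 1..10: the first window holds only 1, 2.
    slidingCountWindows(1 to 10, 4, 2).foreach(w => println(w.toList))
    // Mirrors countWindow(2, 1) from the question: the first window holds only 1.
    slidingCountWindows(1 to 4, 2, 1).foreach(w => println(w.toList))
  }
}
```

Under this model the first window fires as soon as `slide` elements have arrived, before `size` elements exist, which is why the question's actual output starts with a window containing only 1.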

I understood it thanks to Dawid Wysakowicz's answer. I just wanted to add a figure in the hope that it helps understanding.

[Figure: image missing from source]

Indeed, in a sliding window whose size is twice the slide, as here, each element has to be contained in 2 windows. That is, the first element has to be in 2 windows as well.
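As a rough check of that count, the sketch below computes how many fired windows contain a given element. It assumes an unbounded stream, and that a window fires after every `slide` elements and holds the last `size` elements. `WindowMembershipSketch` and `windowsContaining` are hypothetical names, not Flink APIs.

```scala
object WindowMembershipSketch {
  // Hypothetical helper, not a Flink API: number of fired windows containing the
  // element at 1-based position `i`. The firing at element count n holds
  // positions n - size + 1 .. n, so element i is included exactly when
  // i <= n < i + size and n is a multiple of `slide`.
  def windowsContaining(i: Int, size: Int, slide: Int): Int = {
    require(i > 0 && size > 0 && slide > 0, "arguments must be positive")
    (i until i + size).count(_ % slide == 0)
  }

  def main(args: Array[String]): Unit = {
    println((1 to 5).map(windowsContaining(_, 2, 1))) // countWindow(2, 1)
    println((1 to 5).map(windowsContaining(_, 4, 2))) // countWindow(4, 2)
    println((1 to 5).map(windowsContaining(_, 4, 1))) // size four times the slide
  }
}
```

For countWindow(2, 1) and countWindow(4, 2) every element, including the first, lands in 2 windows; with a size four times the slide it would be 4.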
