
Windowing and Watermark in Apache Beam: Google Dataflow

I have a fixed window of 5 minutes (300 seconds), and I am working with event time.

beam.WindowInto(window.FixedWindows(300))

When I deploy this code, are the windows created immediately, even if I have not published any messages yet? Suppose I deploy at 6:30: are the windows automatically created as 6:30 to 6:35, 6:35 to 6:40, and so on?

If I publish a message to the topic with event timestamp = 6:31 (Unix seconds, i.e. 10,176589653) when the system time is 6:36, does that mean the watermark for that specific message is at 6:31? And since the system time is already 6:36 and allowed lateness = 0, will the message miss its window and be rejected?

Windows are always aligned using UNIX time 0 as the base. That means that whether you start the pipeline at 6:31, 6:32 or 6:35, the windows will always be [6:30, 6:35), [6:35, 6:40), and so on. Note that this also applies to day-sized windows: they would start at 00:00 UTC.

If you want to change this alignment, there is an offset parameter on FixedWindows.
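To make the alignment concrete, here is a minimal plain-Python sketch of the arithmetic, without Beam itself. The helper names fixed_window and hhmm and the example date are made up for illustration. The formula is the usual way to align fixed windows to the epoch, and the offset argument is meant to play the same role as the second argument in window.FixedWindows(size, offset).

```python
from datetime import datetime, timezone


def fixed_window(ts, size, offset=0):
    """Return (start, end) in Unix seconds of the fixed window containing ts.

    Hypothetical helper: windows are aligned to UNIX time 0, shifted by offset.
    """
    start = ts - (ts - offset) % size
    return start, start + size


def hhmm(ts):
    """Format a Unix timestamp as HH:MM in UTC (illustrative helper)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%H:%M")


# Assumed example: an event stamped 06:31 UTC on an arbitrary date.
event = int(datetime(2024, 1, 1, 6, 31, tzinfo=timezone.utc).timestamp())

# Default alignment: the window does not depend on when the pipeline started.
start, end = fixed_window(event, 300)
print(hhmm(start), hhmm(end))  # 06:30 06:35

# With a 120-second offset the boundaries shift to ...:22, ...:27, ...:32.
start, end = fixed_window(event, 300, offset=120)
print(hhmm(start), hhmm(end))  # 06:27 06:32
```

The window is a pure function of the element's event timestamp, so nothing is "created" at deploy time: a window only comes into play once an element is assigned to it.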

