Why do I get the error "TypeError: Cannot convert GlobalWindow to _IntervalWindowBase" in apache-beam when using a session window?

When I use a session window with a 1h gap, after processing millions of messages I get the following error in the logs, probably only for some rows:

TypeError: Cannot convert GlobalWindow to apache_beam.utils.windowed_value._IntervalWindowBase

Code:

grouped_tis = (
    tracking_informations
    | beam.WindowInto(window.Sessions(session_window_gap))
    | beam.GroupByKey()
    | beam.ParDo(MergeTI())
    | "TI model -> json" >> beam.Map(jsons.dump)
)

Full stack trace: https://pastebin.com/pqA5pMay

This is probably because some code (for example MergeTI) is returning elements in the GlobalWindow, while the PCollection has a different windowing: beam.WindowInto(window.Sessions(session_window_gap))

If anyone is experiencing the same problem: I solved it by inserting beam.WindowInto(beam.window.GlobalWindows()) between beam.WindowInto(NONGLOBALWINDOW) | beam.GroupByKey() and the other PTransform that causes the problem.
