
Google Cloud Dataflow Python SDK updates

When using the Google Cloud Dataflow Python SDK, a job that starts by reading a large amount of data from Cloud Storage takes a while and then fails with the error AssertionError: Job did not reach to a terminal state after waiting indefinitely.

Searching for this, we found the open issue BEAM-5529, which refers to patch #6535. The patch was released in version 2.8.0 but is not mentioned in the release notes.

On the other hand, the currently published version is google-cloud-dataflow 2.5.0.

Is there an update policy, or is it each user's responsibility to build and package a new version that includes the latest releases?

Any help or comments are welcome.

According to the official Google Cloud Platform documentation:

The Cloud Dataflow SDK 2.5.0 is the last Cloud Dataflow SDK release that is separate from the Apache Beam SDK releases. The Cloud Dataflow service fully supports official Apache Beam SDK releases.

So yes, google-cloud-dataflow 2.5.0 is the last release, and from that version on you should use the official apache-beam releases. Bear in mind that you need to install the library with the [gcp] extra:

pip install apache-beam[gcp]

Finally, the fix from #6535 should already be included: I installed the library with "pip install apache-beam[gcp]==2.8.0", opened the file "apache_beam/runners/dataflow/dataflow_runner.py", and the fix is applied there.
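If you want to check this from code rather than by opening the file, a minimal sketch along these lines may help. It only compares the installed apache-beam version against 2.8.0, the release this thread associates with patch #6535; it does not inspect dataflow_runner.py itself. The helper names (parse_version, has_fix, installed_beam_version) are hypothetical and not part of Beam.

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.8.0' into (2, 8, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_fix(installed, minimum="2.8.0"):
    """Assumption from this thread: the fix ships in 2.8.0 and later.

    Pre-release strings such as '2.8.0rc1' parse to (2, 8) and are
    treated conservatively as older than 2.8.0.
    """
    return parse_version(installed) >= parse_version(minimum)


def installed_beam_version(package="apache-beam"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_beam_version()
    if version is None:
        print("apache-beam is not installed")
    elif has_fix(version):
        print(version, "is 2.8.0 or newer")
    else:
        print(version, "predates 2.8.0; consider upgrading")
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would rank "2.10.0" below "2.8.0".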
