

Can I configure Google DataFlow to keep nodes up when I drain a pipeline

I am deploying a pipeline to Google Cloud DataFlow using Apache Beam. When I want to deploy a change to the pipeline, I drain the running pipeline and redeploy it. I would like to make this faster. It appears from the logs that on each deploy DataFlow builds up new worker nodes from scratch: I see Linux boot messages going by.

Is it possible to drain the pipeline without tearing down the worker nodes so the next deployment can reuse them?

Rewriting Inigo's answer here:

Answering the original question: no, there's no way to do that. Updating should be the way to go. I was not aware it was marked as experimental (probably we should change that), but the update approach has not changed in the last 3 years I have been using Dataflow. As for the special cases where update doesn't work: even if the feature you're asking for existed, the workers would still need the new code anyway, so there is not really much to save, and update should work in most other cases.
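For reference, here is a minimal sketch of the update approach with the Beam Python SDK. It assumes a streaming pipeline already running on the DataflowRunner; the project, region, bucket, topic names, and job name are all placeholders, and the key point is that job_name must match the running job while update=True tells Dataflow to replace it in place rather than start from scratch:

# update_deploy.py - redeploy over a running Dataflow job instead of
# drain-and-recreate. All project/region/topic/bucket values are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner='DataflowRunner',
    project='my-gcp-project',            # placeholder
    region='us-central1',                # placeholder
    temp_location='gs://my-bucket/tmp',  # placeholder
    streaming=True,
    job_name='my-streaming-job',         # must match the name of the running job
    update=True,                         # replace the running job in place
)

with beam.Pipeline(options=options) as p:
    (p
     | 'Read' >> beam.io.ReadFromPubSub(
         topic='projects/my-gcp-project/topics/events')
     | 'Process' >> beam.Map(lambda msg: msg.decode('utf-8'))
     | 'Write' >> beam.io.WriteToPubSub(
         topic='projects/my-gcp-project/topics/processed'))

Note that an update is only accepted when the old and new job graphs are compatible; if you rename steps between versions, Dataflow also accepts a transform_name_mapping option to map old step names to new ones.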
