

Can I configure Google DataFlow to keep nodes up when I drain a pipeline

I am deploying a pipeline to Google Cloud DataFlow using Apache Beam. When I want to deploy a change to the pipeline, I drain the running pipeline and redeploy it. I would like to make this faster. It appears from the logs that on each deploy DataFlow builds up new worker nodes from scratch: I see Linux boot messages going by.

Is it possible to drain the pipeline without tearing down the worker nodes so the next deployment can reuse them?

Rewriting Inigo's answer here:

Answering the original question: no, there's no way to do that. Updating should be the way to go. I was not aware it was marked as experimental (probably we should change that), but the update approach has not changed in the last 3 years I have been using Dataflow. As for the special cases where update doesn't work: even if the feature you're asking for existed, the workers would still need the new code anyway, so there is not really much to save, and update should work in most other cases.
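For reference, here is a minimal sketch of the update approach with the Beam Python SDK. It assumes a streaming pipeline already running on the DataflowRunner; the project, region, bucket, topic names, and job name are all placeholders, and the key point is that job_name must match the running job while update=True tells Dataflow to replace it in place rather than start from scratch:

# update_deploy.py - redeploy over a running Dataflow job instead of
# drain-and-recreate. All project/region/topic/bucket values are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner='DataflowRunner',
    project='my-gcp-project',            # placeholder
    region='us-central1',                # placeholder
    temp_location='gs://my-bucket/tmp',  # placeholder
    streaming=True,
    job_name='my-streaming-job',         # must match the name of the running job
    update=True,                         # replace the running job in place
)

with beam.Pipeline(options=options) as p:
    (p
     | 'Read' >> beam.io.ReadFromPubSub(
         topic='projects/my-gcp-project/topics/events')
     | 'Process' >> beam.Map(lambda msg: msg.decode('utf-8'))
     | 'Write' >> beam.io.WriteToPubSub(
         topic='projects/my-gcp-project/topics/processed'))

Note that an update is only accepted when the old and new job graphs are compatible; if you rename steps between versions, Dataflow also accepts a transform_name_mapping option to map old step names to new ones.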
