How to safely update jobs in-flight using Apache Flink on AWS EMR?

I was not able to find instructions for how to safely update the code of a running job. I see the Flink docs on how to use savepoints, but I'd expect a straightforward procedure for updating Flink jobs on AWS EMR.

https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/aws.html

https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/upgrading.html

https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html

I was expecting instructions like the following, but for Flink rather than Dataflow and Apache Beam:

https://cloud.google.com/dataflow/docs/guides/updating-a-pipeline

https://medium.com/google-cloud/restarting-cloud-dataflow-in-flight-9c688c49adfd

To achieve that, you need to cancel your job with a savepoint, either with the Flink command-line interface or via the REST API. In both cases you will receive the path of the savepoint. With the REST API you first receive a request ID, since cancel is an asynchronous operation, but you can use that ID to retrieve the savepoint path.
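As a rough sketch of the REST variant, the Python below triggers a savepoint with `cancel-job` set and then polls for the resulting path. It assumes the JobManager REST endpoint is reachable at a `base_url` you supply (on EMR this typically means going through the YARN proxy or an SSH tunnel, which is not shown), and the helper names `trigger_request`, `savepoint_location` and `cancel_with_savepoint` are made up for this illustration. The CLI counterpart is `flink cancel -s <targetDirectory> <jobID>`.

```python
import json
import time
import urllib.request


def trigger_request(base_url, job_id, target_dir):
    """Build the URL and JSON body for a cancel-with-savepoint request."""
    url = f"{base_url}/jobs/{job_id}/savepoints"
    body = {"target-directory": target_dir, "cancel-job": True}
    return url, body


def savepoint_location(status):
    """Return the savepoint path from a status response.

    Returns None while the operation is still running and raises if the
    operation completed without producing a location.
    """
    if status.get("status", {}).get("id") != "COMPLETED":
        return None
    operation = status.get("operation") or {}
    if "location" in operation:
        return operation["location"]
    raise RuntimeError(f"savepoint failed: {operation.get('failure-cause')}")


def cancel_with_savepoint(base_url, job_id, target_dir, poll_seconds=2.0):
    """Trigger the savepoint, then poll with the request ID until it finishes."""
    url, body = trigger_request(base_url, job_id, target_dir)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        request_id = json.load(response)["request-id"]

    while True:
        with urllib.request.urlopen(f"{url}/{request_id}") as response:
            location = savepoint_location(json.load(response))
        if location is not None:
            return location
        time.sleep(poll_seconds)
```

Calling `cancel_with_savepoint("http://localhost:8081", job_id, "s3://my-bucket/savepoints")` would return the savepoint path once the job has been cancelled; the host and bucket here are placeholders.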

After getting the savepoint path, you can start the new job, again via either the REST API or the CLI. In both cases you provide the savepoint path when starting the job, so that Flink automatically restores the state from the savepoint, including all records that were in flight.
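The restart step could look roughly like this. The sketch only builds the `flink run -s` command line and the body for the REST `/jars/:jarid/run` endpoint; the function names, jar path and savepoint path are hypothetical, and it assumes the updated jar is already on the master node (CLI) or uploaded to the cluster (REST).

```python
def run_command(jar_path, savepoint_path, allow_non_restored=False, program_args=()):
    """Build the CLI invocation that starts a job from a savepoint."""
    command = ["flink", "run", "-s", savepoint_path]
    if allow_non_restored:
        # Lets the job start even if the savepoint holds state for
        # operators that no longer exist in the updated code.
        command.append("--allowNonRestoredState")
    command.append(jar_path)
    command.extend(program_args)
    return command


def run_request(base_url, jar_id, savepoint_path, allow_non_restored=False):
    """Build the URL and JSON body for starting an uploaded jar from a savepoint."""
    url = f"{base_url}/jars/{jar_id}/run"
    body = {
        "savepointPath": savepoint_path,
        "allowNonRestoredState": allow_non_restored,
    }
    return url, body
```

The command list can be passed to `subprocess.run(command, check=True)` on the EMR master node. Whether the new code can restore the old state depends on operator IDs and state types staying compatible, which the upgrading guide linked in the question covers.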
