
Retry of dataproc cluster creation in Airflow 1.10

Hello all, I need some help with Airflow. We use Airflow 1.10. We have a requirement to retry the tasks that create and delete a Dataproc cluster if they fail. Airflow 2.0 has a retry parameter for this, but Airflow 1.10 does not seem to have any such parameter for retrying cluster creation and deletion. Can anyone suggest an alternative so that if creation or deletion of the Dataproc cluster fails, we can retry it?

DataprocCreateClusterOperator has a retry parameter:

 :param retry: A retry object used to retry requests. If ``None`` is specified, requests will not be retried.

This ability was added in PR and is available for Airflow>=1.10.

If you are using Airflow<2.0, you will need to:

pip install apache-airflow-backport-providers-google

If you are using Airflow>=2.0, you will need to:

pip install apache-airflow-providers-google

Then you can import the operator as:

from airflow.providers.google.cloud.operators.dataproc import DataprocCreateClusterOperator

and use it as:

create_cluster_operator = DataprocCreateClusterOperator(
    task_id='create_dataproc_cluster',
    cluster_name="test",
    ...,
    retry=YOUR_RETRY_VALUE,
    timeout=YOUR_TIMEOUT_VALUE,
)
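Note that retry expects a google.api_core Retry object rather than an integer. A minimal sketch, assuming the Google provider package (which pulls in google-api-core) is installed; the backoff values, project/region, and cluster name below are illustrative, not prescriptive:

```python
from google.api_core.retry import Retry
from airflow.providers.google.cloud.operators.dataproc import DataprocCreateClusterOperator

# Illustrative backoff policy: first wait 10s, double the wait each time,
# cap individual waits at 60s, and give up after 600s overall.
cluster_retry = Retry(initial=10.0, multiplier=2.0, maximum=60.0, deadline=600.0)

create_cluster = DataprocCreateClusterOperator(
    task_id='create_dataproc_cluster',
    project_id='my-project',      # illustrative
    region='us-central1',         # illustrative
    cluster_name='test',
    retry=cluster_retry,          # retries the API request itself
    timeout=600.0,
)
```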

Note that all Airflow operators inherit from BaseOperator, which has a retries parameter:

 :param retries: the number of retries that should be performed before failing the task

Don't be confused between retry and retries.
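The distinction can be sketched in plain Python (no Airflow needed): retries makes the scheduler re-run the whole task after it fails, while retry re-issues the underlying API request within a single task attempt. Everything below is an illustrative stand-in, not Airflow's actual implementation:

```python
# Sketch of BaseOperator's `retries` semantics: the whole task body is
# re-executed after each failure, up to `retries` extra attempts.
def run_task_with_retries(task, retries):
    for attempt in range(retries + 1):
        try:
            return task()
        except RuntimeError:
            if attempt == retries:
                raise  # out of retries: the task is marked failed

attempts = {"count": 0}

def flaky_create_cluster():
    """Stand-in for cluster creation that fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient Dataproc error")
    return "cluster created"

print(run_task_with_retries(flaky_create_cluster, retries=3))
```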
