Can a failed Airflow DAG task retry with a changed parameter?

With Airflow, is it possible to restart an upstream task if a downstream task fails? This seems to go against the "acyclic" part of the term DAG. I would think this is a common problem, though.

Background

I'm looking into using Airflow to manage a data processing workflow that has so far been managed manually.

There is a task that will fail if a parameter x is set too high, but increasing the parameter value gives better-quality results. We have not found a way to calculate a safe but maximally high value of x. The manual process has been to restart the failed job with a lower value of x until it succeeds.

The workflow looks something like this:

Task A - Gather the raw data

Task B - Generate config file for job

Task C - Modify config file parameter x

Task D - Run the data manipulation job

Task E - Process job results

Task F - Generate reports

Issue

If task D fails because parameter x is too high, I want to rerun task C and task D. This doesn't seem to be supported. I would really appreciate some guidance on how to handle this.

First of all: that's an excellent question; I wonder why it hasn't been discussed widely until now.


I can think of two possible approaches:

  1. Fusing Operators: As pointed out by @Kris, combining the operators into a single one appears to be the most obvious workaround

  2. Separate Top-Level DAGs: Read below
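For the first approach, here is a minimal sketch of what a fused "Task C + Task D" callable could look like. It is plain Python and makes no Airflow calls; the idea is that something like it could be handed to a single operator so that the retry loop lives inside one task. The names `run_with_decreasing_x`, `ParameterTooHighError`, `run_job`, and `candidates` are all hypothetical and not from the question; how the real job signals "x too high" is an assumption.

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


class ParameterTooHighError(Exception):
    """Hypothetical error raised by the job when x is too high."""


def run_with_decreasing_x(
    run_job: Callable[[float], T],
    candidates: Iterable[float],
) -> Tuple[float, T]:
    """Try the job with each candidate x in order and stop at the first success.

    `candidates` is expected to be ordered from the highest (best quality)
    value to the lowest (safest) one. Returns the x that worked together
    with the job's result.
    """
    last_error = None
    for x in candidates:
        try:
            # Equivalent of "Task C" (set x) followed by "Task D" (run the job).
            return x, run_job(x)
        except ParameterTooHighError as err:
            # Only this specific failure triggers a retry with a lower x;
            # any other exception propagates and fails the task.
            last_error = err
    raise RuntimeError("job failed for every candidate value of x") from last_error
```

The trade-off is that Airflow then sees C and D as one task, so the individual attempts show up only in that task's log rather than as separate task instances.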


Separate Top-Level DAGs approach

Given

  • Say you have tasks A & B
  • A is upstream of B
  • You want execution to resume (retry) from A if B fails

(Possible) Idea: If you're feeling adventurous

  • Put tasks A & B in separate top-level DAGs, say DAG-A & DAG-B
  • At the end of DAG-A, trigger DAG-B using TriggerDagRunOperator
    • In all likelihood, you will also have to use an ExternalTaskSensor after TriggerDagRunOperator
  • In DAG-B, put a BranchPythonOperator after Task-B with trigger_rule=all_done, so that it runs whether Task-B succeeded or failed
  • This BranchPythonOperator should branch out to another TriggerDagRunOperator that then invokes DAG-A (again!)
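The decision made by that BranchPythonOperator can be sketched as a small pure-Python helper. A branch callable returns the task_id of the task to follow, so the helper below returns a task_id plus the parameters to hand to the next DAG-A run. Everything here is an assumption for illustration: the task ids, the names `choose_branch`, `x`, `step`, and `min_x`, and the idea of passing the lowered x along with the re-trigger. The Airflow wiring itself is left out.

```python
from typing import Optional, Tuple

# Hypothetical task ids inside DAG-B.
RETRIGGER_TASK_ID = "retrigger_dag_a"  # the TriggerDagRunOperator that re-invokes DAG-A
DONE_TASK_ID = "done"                  # a no-op task for the success path


def choose_branch(
    task_b_state: str, x: int, step: int, min_x: int
) -> Tuple[str, Optional[dict]]:
    """Pick the branch to follow after Task-B and the parameters for a re-run.

    `task_b_state` is the final state of Task-B ("success" or "failed").
    On failure, x is lowered by `step`; once it would drop below `min_x`
    the helper raises instead of re-triggering, so the two DAGs cannot
    keep invoking each other forever.
    """
    if task_b_state == "success":
        return DONE_TASK_ID, None

    next_x = x - step
    if next_x < min_x:
        raise ValueError("x cannot be lowered any further; giving up")
    return RETRIGGER_TASK_ID, {"x": next_x}
```

Because the two DAGs trigger each other, some stopping condition like the `min_x` guard is worth having; without one, a job that fails for every x would loop indefinitely.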

EDIT-1

Here's a much simpler way that can achieve similar behaviour:

How can you re-run upstream task if a downstream task fails in Airflow (using Sub Dags)
