
How to set dependencies between DAGs in Airflow?

I am using Airflow to schedule batch jobs. I have one DAG (A) that runs every night and another DAG (B) that runs once per month. B depends on A having completed successfully. However, B takes a long time to run, so I would like to keep it in a separate DAG to allow better SLA reporting.

How can I make a run of DAG B dependent on a successful run of DAG A on the same day?

You can achieve this behavior with the ExternalTaskSensor. Your task (B1) in DAG (B) will be scheduled and will then wait for task (A2) in DAG (A) to succeed.

External Task Sensor documentation
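
One caveat, as far as I understand the sensor: by default ExternalTaskSensor waits for a run of DAG A whose logical (execution) date equals that of the current DAG B run, and a nightly run and a monthly run normally do not share one. The sensor's `execution_delta` or `execution_date_fn` argument can translate between the two. Below is a minimal sketch, not a definitive implementation. It assumes Airflow 2.x import paths, that A is scheduled `@daily` and B `@monthly`, and that each run starts once its schedule interval has ended; the helper names `nightly_run_for_monthly_run` and `make_wait_for_a` are made up for illustration.

```python
from datetime import datetime, timedelta


def nightly_run_for_monthly_run(logical_date):
    """Map DAG B's logical date to the logical date of the DAG A run to wait for.

    Assumes A is scheduled "@daily" and B "@monthly", and that a run starts
    once its schedule interval has ended.
    """
    # First day of the month after logical_date: the moment B actually starts.
    if logical_date.month == 12:
        b_start = logical_date.replace(year=logical_date.year + 1, month=1, day=1)
    else:
        b_start = logical_date.replace(month=logical_date.month + 1, day=1)
    # The nightly run of A that starts at that same moment covers the previous day.
    return b_start - timedelta(days=1)


def make_wait_for_a(dag):
    """Build sensor B1 inside DAG B (hypothetical helper)."""
    # Airflow 2.x import path assumed; imported here so the date helper
    # above can be tested without Airflow installed.
    from airflow.sensors.external_task import ExternalTaskSensor

    return ExternalTaskSensor(
        task_id="B1",
        external_dag_id="A",
        external_task_id="A2",
        execution_date_fn=nightly_run_for_monthly_run,
        mode="reschedule",  # free the worker slot between checks
        dag=dag,
    )
```

Under those assumptions, B's run for January (logical date 2024-01-01, started on 2024-02-01) waits for A's run with logical date 2024-01-31. Adjust the mapping if your schedules differ.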

It looks like a TriggerDagRunOperator can be used as well, and you can use a Python callable to add some logic, as explained here: https://www.linkedin.com/pulse/airflow-lesson-1-triggerdagrunoperator-siddharth-anand
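
A rough sketch of that idea follows. It is an assumption on my part that you are on Airflow 2.2 or later, where, as far as I know, TriggerDagRunOperator no longer takes a `python_callable`, so the conditional logic is moved into a ShortCircuitOperator placed in front of it. The rule "trigger on the first day of the month" and the names `should_trigger_b`, `add_trigger_tasks`, `is_monthly_run` and `trigger_B` are placeholders.

```python
from datetime import datetime


def should_trigger_b(logical_date, **_):
    """Return True only for the nightly run that should start DAG B.

    The rule (first day of the month) is a placeholder; use whatever
    condition defines "monthly" for your pipeline.
    """
    return logical_date.day == 1


def add_trigger_tasks(dag, a_last_task):
    """Append "check, then trigger B" after the last task of DAG A."""
    # Airflow 2.x import paths assumed; imported here so the rule above
    # can be tested without Airflow installed.
    from airflow.operators.python import ShortCircuitOperator
    from airflow.operators.trigger_dagrun import TriggerDagRunOperator

    check = ShortCircuitOperator(
        task_id="is_monthly_run",
        python_callable=should_trigger_b,  # receives logical_date from the context
        dag=dag,
    )
    trigger = TriggerDagRunOperator(
        task_id="trigger_B",
        trigger_dag_id="B",
        dag=dag,
    )
    # When the callable returns False, the downstream trigger task is skipped.
    a_last_task >> check >> trigger
    return trigger
```

With this setup DAG B would typically have no schedule of its own, since DAG A starts it.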

When a cross-DAG dependency is needed, there are often two requirements:

  1. Task B1 on DAG B needs to run after task A1 on DAG A is done. This can be achieved using ExternalTaskSensor, as others have mentioned:

     B1 = ExternalTaskSensor(task_id="B1", external_dag_id="A", external_task_id="A1", mode="reschedule")

  2. When a user clears task A1 on DAG A, we want Airflow to clear task B1 on DAG B as well so that it re-runs. This can be achieved using ExternalTaskMarker (available since Airflow v1.10.8):

     A1 = ExternalTaskMarker(task_id="A1", external_dag_id="B", external_task_id="B1")

Please see the documentation on cross-DAG dependencies for more details: https://airflow.apache.org/docs/stable/howto/operator/external.html


 