
How to branch multiple paths in Airflow DAG using branch operator?

This is what I want, but I don't know how to achieve it in Airflow, because both of the tasks are being executed.

[Image: enter image description here]

To summarize:

  • T1 executes
  • T2 executes
  • Based on the output of T2, I want to run either option_1 -> complete or option_2 -> do_x -> do_y -> complete

How should I structure this? I have this as my current code:

(t1 >> t2 >> option_1 >> complete)
(t1 >> t2 >> option_2 >> do_x >> do_y >> complete)

t2 in this case is a branch operator.

I've also tried the list syntax, ... >> [option_1, option_2] >> ..., but I need a completely separate path to execute, not just a single task to be switched.

The dependencies in your code are correct for branching. Make sure the BranchPythonOperator's callable returns the task_id of the first task in the branch to follow (option_1 or option_2 here), based on whatever logic you need; the tasks in the branch that is not chosen are skipped. See the BranchPythonOperator documentation for more detail. One last important note concerns the "complete" task: because both branches converge on it, set its trigger_rule to "none_failed" (or the equivalent TriggerRule.NONE_FAILED constant). Otherwise, under the default "all_success" rule, "complete" is skipped along with the branch that was not chosen.

Quick code test for your reference:

from airflow.models import DAG
from airflow.operators.dummy import DummyOperator
from airflow.operators.python import BranchPythonOperator
from airflow.utils.trigger_rule import TriggerRule

from datetime import datetime


DEFAULT_ARGS = dict(
    start_date=datetime(2021, 5, 5),
    owner="airflow",
    retries=0,
)

DAG_ARGS = dict(
    dag_id="multi_branch",
    schedule_interval=None,
    default_args=DEFAULT_ARGS,
    catchup=False,
)


def random_branch():
    from random import randint

    return "option_1" if randint(1, 2) == 1 else "option_2"


with DAG(**DAG_ARGS) as dag:
    t1 = DummyOperator(task_id="t1")

    t2 = BranchPythonOperator(task_id="t2", python_callable=random_branch)

    option_1 = DummyOperator(task_id="option_1")

    option_2 = DummyOperator(task_id="option_2")

    do_x = DummyOperator(task_id="do_x")

    do_y = DummyOperator(task_id="do_y")

    complete = DummyOperator(task_id="complete", trigger_rule=TriggerRule.NONE_FAILED)

    t1 >> t2 >> option_1 >> complete
    t1 >> t2 >> option_2 >> do_x >> do_y >> complete

[Image: DAG with BranchPythonOperator]
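The random_branch function above is only a placeholder for real logic. As a rough sketch of branching on an actual value, the callable below picks a task_id from a number pulled from XCom. The row-count rule, the threshold of 100, and the assumption that t1 pushes a number to XCom are all hypothetical and not part of the question; only ti.xcom_pull(task_ids=...) is Airflow's own API. In Airflow 2, the function could be passed as python_callable=branch_from_xcom, with the task instance injected as the ti argument.

```python
def choose_branch(row_count, threshold=100):
    """Return the task_id of the first task in the branch to follow.

    The rule (small inputs take the short path) is a made-up example.
    """
    if row_count < threshold:
        return "option_1"  # short path: option_1 -> complete
    return "option_2"      # long path: option_2 -> do_x -> do_y -> complete


def branch_from_xcom(ti):
    """Candidate python_callable for the branch task t2.

    Assumes (hypothetically) that t1 pushed a row count to XCom.
    xcom_pull returns None when nothing was pushed, so fall back to 0.
    """
    row_count = ti.xcom_pull(task_ids="t1") or 0
    return choose_branch(row_count)
```

Keeping the decision in a plain function like choose_branch makes it easy to unit test without starting Airflow.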
