简体   繁体   English

Airflow BranchPythonOperator - 分支后继续

[英]Airflow BranchPythonOperator - Continue After Branch

I have the following operators as shown below.我有以下运算符,如下所示。 I am able to visually see the the graph representation looks correct.我能够直观地看到图形表示看起来是正确的。 However, I am not able to get the functionality to work properly as in the paths will not keep going past either branch.但是,我无法使功能正常工作,因为路径不会继续经过任何一个分支。 Regardless of the date, neither path will keep going to task_05.无论日期如何,两条路径都不会继续到 task_05。

The DAG has two (2) paths: DAG 有两 (2) 条路径:

(1) If it is the first of the month, task_01->test_step->task_02->task_05->task_06 (1)如果是月初,task_01->test_step->task_02->task_05->task_06

(2) If it is not the first of the month, task_01->test_step->task_03->task_04->task_05->task_06 (2)如果不是月初,task_01->test_step->task_03->task_04->task_05->task_06

Problem: The functionality does not keep the DAG to complete all the way throug task_06.问题:该功能无法让 DAG 一直完成到 task_06。

Assuming the problems resides in the way I am using the bitwise operators.假设问题在于我使用按位运算符的方式。

Code代码

from airflow.operators.python_operator import BranchPythonOperator
from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime, timedelta
from airflow.models import DAG


default_args = {
    'owner': 'astro',
    'depends_on_past': False,
    'start_date': datetime(2020, 5, 1),
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG(
    dag_id='astro_test',
    default_args=default_args,
    schedule_interval="0 10 * * *",
    catchup=True,
    max_active_runs=1
)

def dom_branch(ds_nodash, **kwargs):

    if ds_nodash[-2:] == '01':
        return "task_02"
    else:
        return "task_03"

with dag:
    one = DummyOperator(task_id='task_01')
    two = DummyOperator(task_id='task_02')
    three = DummyOperator(task_id='task_03')
    four = DummyOperator(task_id='task_04')
    five = DummyOperator(task_id='task_05')
    six = DummyOperator(task_id='task_06')

    dom_operator = BranchPythonOperator(
        task_id=f"test_step",
        provide_context=True,
        op_kwargs={'ds_nodash': '{{ ds_nodash }}'},
        python_callable=dom_branch
    )

    one >> dom_operator >> [two, three] 
    two >> five
    three >> four >> five   
    five >> six

1st of month每月 1 日DAG 图表视图的屏幕截图(本月 1 日)

Not 1st of month不是每月 1 日DAG 图形视图的屏幕截图(不是 1 日)

This is due that task_5 requires both upstreams to finish before it can start.这是因为task_5需要两个上游都完成才能开始。 Instead, you can add a trigger rule: https://airflow.apache.org/docs/stable/concepts.html#trigger-rules相反,您可以添加触发规则: https://airflow.apache.org/docs/stable/concepts.html#trigger-rules

five = DummyOperator(
    task_id='task_05',
    trigger_rule=TriggerRule.ONE_SUCCESS,
)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM