The rest of the tasks should not be skipped if the BranchPythonOperator result is False

I have an Airflow DAG that branches on whether to send an email or not. will_send_email_task is a BranchPythonOperator: if len(message) > 0, it should go to the branch task send_email_notification_task; otherwise, that task is skipped and the flow should go straight to the DummyOperator join_task. The DAG works fine when the result of the branch is True (yes, it should send an email). However, when the result is False, the rest of the DAG is skipped, which is not what I expected. If will_send_email_task takes the False branch, only send_email_notification_task should be skipped/bypassed, and the rest of the flow should continue as normal.

[Screenshot: skipped DAG]

Here is the Airflow DAG snippet:


# this function determines whether to send an email or not
def will_send_email(push_task, **context):
    message = context["task_instance"].xcom_pull(task_ids=push_task)

    if len(message) > 0:
        logging.info(f"email body: {message}")
        context["task_instance"].xcom_push(key="message", value=message)
        return 'send_email_notification_task'
    else:
        return 'join_task'
        
def some_python_callable(table_name, **context):
    ...

will_send_email_task = BranchPythonOperator(
    task_id='will_send_email_task',
    provide_context=True,
    python_callable=will_send_email,
    op_kwargs={'push_task': 'some_previous_task'},
    dag=dag
)


join_task = DummyOperator(
    task_id='join_task',
    dag=dag
)

send_email_notification_task = EmailOperator(
    task_id='send_email_notification_task',
    to=default_args['email'],
    subject="some email subject",
    html_content="{{ task_instance.xcom_pull(key='message') }}",
    dag=dag
)

end_task = DummyOperator(
    task_id='end_task',
    dag=dag
)

...

for table, val in some_dict.items():

    offload_task = PythonOperator(
        task_id = f"offload_{table}_task",
        dag=dag,
        provide_context=True,
        python_callable=some_python_callable,
        op_kwargs={'table_name': table}
    )
    
    offload_task.set_upstream(join_task)
    offload_task.set_downstream(end_task)

How should I configure my DAG so it still runs as expected?

You will need to set trigger_rule='none_failed_min_one_success' for the join_task:

join_task = DummyOperator(
    task_id='join_task',
    dag=dag,
    trigger_rule='none_failed_min_one_success'
)

This is a use case explained in the trigger rules docs. The default trigger rule is all_success, but in your case one of the upstream tasks of join_task is guaranteed to be skipped, so you cannot use the default trigger rule.
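
To make the skip propagation concrete, here is a minimal, self-contained sketch of the pattern. It assumes Airflow 2.2+ import paths, and since the question elides the dependency wiring between the branch, the email task and join_task, the edges below are an assumption for illustration only (the DAG id and callable are placeholders, not your originals):

from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.operators.python import BranchPythonOperator
from airflow.utils.trigger_rule import TriggerRule

with DAG(
    dag_id="branch_join_sketch",   # illustrative name only
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:

    def choose_branch(**context):
        # Pretend the upstream message is empty, so the email branch is skipped
        return "join_task"

    will_send_email_task = BranchPythonOperator(
        task_id="will_send_email_task",
        python_callable=choose_branch,
    )

    send_email_notification_task = DummyOperator(task_id="send_email_notification_task")

    # Without this trigger rule, join_task is skipped whenever the email branch
    # is skipped, and the skip cascades to every downstream task.
    join_task = DummyOperator(
        task_id="join_task",
        trigger_rule=TriggerRule.NONE_FAILED_MIN_ONE_SUCCESS,
    )

    end_task = DummyOperator(task_id="end_task")

    will_send_email_task >> [send_email_notification_task, join_task]
    send_email_notification_task >> join_task
    join_task >> end_task

With this wiring, returning 'join_task' from the branch skips only send_email_notification_task; join_task still has one successful direct upstream (the branch task itself), so it runs and end_task runs after it.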

Note: For Airflow < 2.2, use trigger_rule='none_failed_or_skipped'. The trigger rule was renamed in later versions because its name was confusing (see PR); you can also use trigger_rule='all_done'.
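
For reference, the same join_task declaration with the pre-2.2 spelling of the rule (assuming the rest of your DAG stays as in the question):

join_task = DummyOperator(
    task_id='join_task',
    dag=dag,
    trigger_rule='none_failed_or_skipped'  # renamed to 'none_failed_min_one_success' in Airflow 2.2
)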
