
Airflow: how to mark ExternalTaskSensor operator as Success after timeout

I have a DAG A, which waits for some operators in other DAGs B and C to download data, and then performs some computations on it.

If it takes too long for some operators in DAGs B and C, I'd like to continue without the "hanging" operators and use whatever data I have received so far.

Thus, I have a timeout, and I'd like to mark my ExternalTaskSensors as Success after a given timeout.
How can I do that?

# dag A:

from airflow.sensors.external_task_sensor import ExternalTaskSensor  # Airflow 1.x import path

wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    dag=dag,
    timeout=(4 * 3600) # After 4 hours, I want to continue A "as is"
)

That is currently not possible, but what you can do is set trigger_rule='all_done' on the task that is directly dependent on wait_for_task_1.

Example:

wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    dag=dag,
    timeout=(4 * 3600) # After 4 hours, I want to continue A "as is"
)

task_2 = DummyOperator(task_id='task_2', trigger_rule='all_done', dag=dag)

wait_for_task_1 >> task_2

That will allow the downstream task to run even though the sensor has failed (for example, on timeout). The default trigger_rule for all tasks is all_success.

Docs: https://airflow.apache.org/concepts.html#trigger-rules
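For completeness, here is a minimal self-contained sketch of the whole pattern. The import paths assume Airflow 1.x (in Airflow 2.x the sensor lives in airflow.sensors.external_task, and DummyOperator has been superseded by EmptyOperator); the schedule and poke_interval values are only illustrative:

from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.sensors.external_task_sensor import ExternalTaskSensor

dag = DAG(
    dag_id='A',
    start_date=datetime(2020, 1, 1),
    schedule_interval='@daily',  # the sensor matches on execution_date by default, so A and B should share a schedule
)

# The sensor raises AirflowSensorTimeout and fails once the 4-hour timeout expires.
wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    timeout=4 * 3600,   # give up after 4 hours
    poke_interval=300,  # illustrative: re-check every 5 minutes
    dag=dag,
)

# trigger_rule='all_done' runs task_2 as soon as the sensor has finished,
# whether it succeeded or failed on timeout.
task_2 = DummyOperator(task_id='task_2', trigger_rule='all_done', dag=dag)

wait_for_task_1 >> task_2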
