I have a DAG A, which waits for some operators in other DAGs, B and C, to download data, and then performs some computations on it. If some operators in DAGs B and C take too long, I'd like to continue without the "hanging" operators and use whatever data I have received so far. Thus, I have a timeout, and I'd like to mark my ExternalTaskSensors as Success after a given timeout. How can I do that?
# dag A:
wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    dag=dag,
    timeout=(4 * 3600)  # After 4 hours, I want to continue A "as is"
)
That is currently not possible, but what you can do is set trigger_rule='all_done' on the task that is directly downstream of wait_for_task_1.
Example:
# Import paths as in Airflow 1.10
from airflow.operators.dummy_operator import DummyOperator
from airflow.sensors.external_task_sensor import ExternalTaskSensor

wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    dag=dag,
    timeout=(4 * 3600)  # After 4 hours, I want to continue A "as is"
)
task_2 = DummyOperator(task_id='task_2', trigger_rule='all_done', dag=dag)

wait_for_task_1 >> task_2
That will allow the downstream task to run even though the sensor task has failed (for example, after hitting its timeout). The default trigger_rule for all tasks is all_success.
Docs: https://airflow.apache.org/concepts.html#trigger-rules
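For completeness, here is how the whole pattern could look for the original use case, with DAG A waiting on both B and C. This is only a minimal sketch: the Airflow 1.10-style import paths, the DAG arguments, and the compute callable are assumptions for illustration, not taken from the question.

# Minimal sketch (assumes Airflow 1.10-style imports; DAG args and compute() are placeholders)
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.sensors.external_task_sensor import ExternalTaskSensor


def compute(**context):
    # Placeholder: work with whatever data B and C managed to download in time.
    pass


with DAG('A', start_date=datetime(2019, 1, 1), schedule_interval='@daily') as dag:
    wait_for_b = ExternalTaskSensor(
        task_id='wait_B_task_1',
        external_dag_id='B',
        external_task_id='task_1',
        timeout=4 * 3600,  # sensor fails after 4 hours
    )
    wait_for_c = ExternalTaskSensor(
        task_id='wait_C_task_1',
        external_dag_id='C',
        external_task_id='task_1',
        timeout=4 * 3600,
    )
    compute_task = PythonOperator(
        task_id='compute',
        python_callable=compute,
        provide_context=True,
        trigger_rule='all_done',  # run once both sensors finish, even if they failed
    )
    [wait_for_b, wait_for_c] >> compute_task

Because compute_task has trigger_rule='all_done', it starts as soon as both sensors reach a terminal state, whether they succeeded or failed after their 4-hour timeout, so the computation proceeds with whatever data arrived in time.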