Airflow: how to mark ExternalTaskSensor operator as Success after timeout
I have a DAG A, which waits for some operators in other DAGs B and C to download the data, and then performs some computations on it.
If, for some operators in DAGs B and C, it takes too long, I'd like to continue without the "hanging" operators and use whatever data I have received so far.
Thus, I have a timeout, and I'd like to mark my ExternalTaskSensors as Success after the given timeout.
How can I do that?
# dag A:
wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    dag=dag,
    timeout=(4 * 3600)  # After 4 hours, I want to continue A "as is"
)
That is currently not possible, but what you can do is set trigger_rule='all_done' on the task that is directly dependent on wait_for_task_1.
Example:
wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    dag=dag,
    timeout=(4 * 3600)  # After 4 hours, I want to continue A "as is"
)
task_2 = DummyOperator(task_id='task_2', trigger_rule='all_done', dag=dag)
wait_for_task_1 >> task_2
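As a related variant, the sensor itself accepts a soft_fail flag, which marks it as skipped rather than failed on timeout; pairing that with trigger_rule='none_failed' on the downstream task achieves a similar effect without recording a failure. This is only a sketch of a DAG file: it assumes Airflow 2.x import paths and a recent scheduling API, and has not been verified against older releases.

```python
# Sketch (assumes Airflow 2.x import paths and the `schedule` argument;
# older releases use `schedule_interval` and different module paths).
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.sensors.external_task import ExternalTaskSensor

with DAG(dag_id='A', start_date=datetime(2023, 1, 1), schedule=None) as dag:
    wait_for_task_1 = ExternalTaskSensor(
        task_id='wait_B_task_1',
        external_dag_id='B',
        external_task_id='task_1',
        timeout=4 * 3600,  # give up after 4 hours...
        soft_fail=True,    # ...and mark the sensor SKIPPED instead of FAILED
    )
    # 'none_failed' runs as long as no upstream task failed, so a skipped
    # sensor does not block task_2 the way the default 'all_success' would.
    task_2 = DummyOperator(task_id='task_2', trigger_rule='none_failed')
    wait_for_task_1 >> task_2
```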
That will allow the downstream task to run even though the upstream task has failed. The default trigger_rule for all tasks is all_success.
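To make the difference between the two rules concrete, here is a tiny pure-Python model (no Airflow required; the function and state names are illustrative, not Airflow's internals) of how 'all_success' and 'all_done' decide whether a downstream task may run:

```python
# Illustrative model of two Airflow trigger rules (not Airflow's actual code).
# 'all_success': run only if every upstream task succeeded.
# 'all_done'   : run once every upstream task finished, whatever its state.

def should_run(trigger_rule, upstream_states):
    finished = {'success', 'failed', 'skipped'}
    if trigger_rule == 'all_success':
        return all(s == 'success' for s in upstream_states)
    if trigger_rule == 'all_done':
        return all(s in finished for s in upstream_states)
    raise ValueError(f'unknown trigger rule: {trigger_rule}')

# The sensor timed out and failed, but the downloads it was guarding may
# still have produced partial data:
states = ['failed']
print(should_run('all_success', states))  # False: default rule blocks task_2
print(should_run('all_done', states))     # True: task_2 runs anyway
```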
Docs: https://airflow.apache.org/concepts.html#trigger-rules