Airflow: how to mark ExternalTaskSensor operator as Success after timeout

I have a dag A, which waits for some operators in other dags B and C to download data, and then performs some computations on it.

If some operators in dags B and C take too long, I'd like to continue without the "hanging" operators and use whatever data I have received so far.

Thus, I have a timeout, and I'd like to mark my ExternalTaskSensors as Success once that timeout expires. How can I do that?

# dag A:

wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    dag=dag,
    timeout=(4 * 3600) # After 4 hours, I want to continue A "as is"
)

That is currently not possible, but you can set trigger_rule='all_done' on the task that directly depends on wait_for_task_1.

Example:

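# Import paths below assume Airflow 1.10.x; newer releases moved these modules.
from airflow.operators.dummy_operator import DummyOperator
from airflow.sensors.external_task_sensor import ExternalTaskSensor

# `dag` is the DAG object for dag A, defined elsewhere.
wait_for_task_1 = ExternalTaskSensor(
    task_id='wait_B_task_1',
    external_dag_id='B',
    external_task_id='task_1',
    dag=dag,
    timeout=(4 * 3600),  # After 4 hours, the sensor fails and A continues "as is"
)

# 'all_done' lets task_2 run once the sensor has finished, whether it succeeded or failed.
task_2 = DummyOperator(task_id='task_2', trigger_rule='all_done', dag=dag)

wait_for_task_1 >> task_2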

That will allow the downstream task to run even though the sensor has failed (for example, because it timed out). The default trigger_rule for all tasks is all_success.
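To make the difference between the two rules concrete, here is a minimal sketch in plain Python. It is a simplified model rather than Airflow's own scheduling code: the function should_run and the FINISHED_STATES set are hypothetical names introduced for illustration, and only the two rules discussed here are covered.

```python
# Simplified model of two Airflow trigger rules (illustration only, not Airflow's code).
# FINISHED_STATES and should_run are hypothetical names, not part of the Airflow API.
FINISHED_STATES = {"success", "failed", "skipped", "upstream_failed"}


def should_run(trigger_rule, upstream_states):
    """Return True if a task with this trigger rule may run given its upstream states."""
    if trigger_rule == "all_success":
        # Default rule: every upstream task must have succeeded.
        return all(state == "success" for state in upstream_states)
    if trigger_rule == "all_done":
        # Every upstream task must have finished, regardless of outcome.
        return all(state in FINISHED_STATES for state in upstream_states)
    raise ValueError(f"unsupported trigger rule: {trigger_rule}")


# A sensor that timed out ends up in the "failed" state.
print(should_run("all_success", ["failed"]))  # False
print(should_run("all_done", ["failed"]))     # True
print(should_run("all_done", ["running"]))    # False
```

Note that with all_done the downstream task also runs when the sensor fails for reasons other than a timeout, so it should be able to cope with missing or partial data.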

Docs: https://airflow.apache.org/concepts.html#trigger-rules
