简体   繁体   English

如何为 Apache Airflow DAG 定义超时?

[英]How to define a timeout for Apache Airflow DAGs?

I'm using airflow 1.10.2 but Airflow seems to ignore the timeout I've set for the DAG.我使用的是 airflow 1.10.2,但 Airflow 似乎忽略了我为 DAG 设置的超时。

I'm setting a timeout period for the DAG using the dagrun_timeout parameter (eg 20 seconds) and I've got a task which takes 2 mins to run, but airflow marks the DAG as successful!我正在使用dagrun_timeout参数(例如 20 秒)为 DAG 设置超时期限,并且我有一个需要 2 分钟才能运行的任务,但气流将 DAG 标记为成功!

args = {
    'owner': 'me',
    'start_date': airflow.utils.dates.days_ago(2),
    'provide_context': True
}

dag = DAG('test_timeout',
          schedule_interval=None,
          default_args=args,
          dagrun_timeout=timedelta(seconds=20))

def this_passes(**kwargs):
    return

def this_passes_with_delay(**kwargs):
    time.sleep(120)
    return

would_succeed = PythonOperator(task_id='would_succeed',
                               dag=dag,
                               python_callable=this_passes,
                               email=to)

would_succeed_with_delay = PythonOperator(task_id='would_succeed_with_delay',
                            dag=dag,
                            python_callable=this_passes_with_delay,
                            email=to)

would_succeed >> would_succeed_with_delay

No error messages are thrown.不会抛出任何错误消息。 Am I using an incorrect parameter?我是否使用了错误的参数?

As stated in the source code :源代码所述

:param dagrun_timeout: specify how long a DagRun should be up before
    timing out / failing, so that new DagRuns can be created. The timeout
    is only enforced for scheduled DagRuns, and only once the
    # of active DagRuns == max_active_runs.

so this might be expected behavior as you set schedule_interval=None .所以这可能是您设置schedule_interval=None预期行为。 Here, the idea is rather to make sure a scheduled DAG won't last forever and block subsequent run intances.在这里,这个想法是为了确保预定的 DAG 不会永远持续下去并阻止后续的运行实例。

Now, you may be interested in the execution_timeout available in all operators.现在,您可能对所有运算符中可用的execution_timeout感兴趣。 For example, you could set a 60s timeout on your PythonOperator like this:例如,您可以像这样在PythonOperator上设置 60 秒超时:

would_succeed_with_delay = PythonOperator(task_id='would_succeed_with_delay',
                            dag=dag,
                            execution_timeout=timedelta(seconds=60),
                            python_callable=this_passes_with_delay,
                            email=to)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM