
How to restart a failed task on Airflow

I am using a LocalExecutor and my DAG has 3 tasks, where task C depends on task A. Tasks A and B can run in parallel, something like below:

A --> C

B

So task A failed, but task B ran fine. Task C has yet to run because task A failed.

My question is: how do I re-run task A alone, so that task C runs once task A completes and the Airflow UI marks them as successful?

In the UI:

  1. Go to the DAG, and the dag run of the run you want to change
  2. Click on Graph View
  3. Click on task A
  4. Click "Clear"

This will let task A run again, and if it succeeds, task C should run. This works because when you clear a task's status, the scheduler treats it as if it had never run for this dag run.
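The same thing can be done from the command line with `airflow tasks clear`. A sketch, assuming the DAG id is `my_dag` and the failed task is `task_A` (both names are placeholders for your own):

```shell
# Clear task_A and everything downstream of it (-d) for the dag run
# starting at the given date (-s), answering yes to the prompt (-y).
# Once cleared, the scheduler re-runs task_A, then task_C.
airflow tasks clear my_dag \
    -t task_A \
    -d \
    -s "2021-01-01" \
    -y
```

Without `-d`, only `task_A` itself is cleared; downstream tasks keep their current state.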

Here's an alternative solution that clears and retries certain tasks automatically. If you only want to clear a single task, omit the -d (downstream) flag:

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.bash_operator import BashOperator
from datetime import datetime, timedelta


def clear_upstream_task(context):
    # On failure, re-run the pipeline from t1: clear t1 and everything
    # downstream of it (-d) for this dag run, answering yes to the
    # confirmation prompt (-y).
    execution_date = context.get("execution_date")
    clear_tasks = BashOperator(
        task_id='clear_tasks',
        bash_command=f'airflow tasks clear -s "{execution_date}" -t t1 -d -y clear_upstream_task'
    )
    return clear_tasks.execute(context=context)


# Default settings applied to all tasks
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(seconds=5)
}


with DAG('clear_upstream_task',
         start_date=datetime(2021, 1, 1),
         max_active_runs=3,
         schedule_interval=timedelta(minutes=5),
         default_args=default_args,
         catchup=False
         ) as dag:
    t0 = DummyOperator(
        task_id='t0'
    )

    t1 = DummyOperator(
        task_id='t1'
    )

    t2 = DummyOperator(
        task_id='t2'
    )
    t3 = BashOperator(
        task_id='t3',
        bash_command='exit 123',
        on_failure_callback=clear_upstream_task
    )

    t0 >> t1 >> t2 >> t3
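Here t3 always fails (`exit 123`), which triggers the callback: t1 and everything downstream of it (t1, t2, t3) are cleared, and the scheduler runs them again. A pure-Python sketch of that "clear downstream" semantics (no Airflow required; the simple state dict and scheduler view are a simplification, not Airflow internals):

```python
# Dependency graph mirroring the example DAG: t0 >> t1 >> t2 >> t3
downstream = {"t0": ["t1"], "t1": ["t2"], "t2": ["t3"], "t3": []}

def clear_with_downstream(task):
    """Return the task plus every task downstream of it (like `-d`)."""
    seen, stack = set(), [task]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(downstream[t])
    return seen

# t3's `exit 123` failed; everything else succeeded earlier.
state = {"t0": "success", "t1": "success", "t2": "success", "t3": "failed"}

# The failure callback clears t1 and its downstream: t1, t2, t3 lose
# their status, so the scheduler will treat them as never having run.
for t in clear_with_downstream("t1"):
    state[t] = None

assert state == {"t0": "success", "t1": None, "t2": None, "t3": None}
```

Note that `max_active_runs=3` and a 5-minute `schedule_interval` mean the DAG keeps producing new runs; each run fails at t3 and retries its cleared chain independently.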
