
"success" dag in skipped task apache airflow

When I have at least one skipped task in my DAG, Airflow still shows the DAG run as a "success".

[screenshot: DAG run shown as success despite a skipped task]

I'm using Slack alerts integrated with Airflow.

I'm looking for a way to make the DAG run "failed" (red circle) if at least one task is skipped, and to have an alert sent to my channel. By the way, I'm using DatabricksRunNowOperator .

thanks all!

> I'm looking for a way to make the DAG run "failed" (red circle) if at least one task is skipped, and to have an alert sent to my channel.

I presume you need to set up the correct trigger rules between tasks, so that when at least one task fails, the whole DAG run ends in a failed state.

Then you can hook up on_failure_callback (attached at the DAG level) to send the alert. If you already send alerts another way, you can ignore this advice.
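A DAG-level failure callback can be sketched like this. It is a minimal sketch, assuming a standard Slack incoming webhook; the webhook URL, function names, and message format here are placeholders, not part of the original answer:

```python
import json
import urllib.request

# Placeholder: a Slack incoming-webhook URL for your channel (hypothetical).
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"


def build_failure_message(context: dict) -> str:
    """Build the alert text from the context dict Airflow passes to callbacks."""
    dag_id = context["dag"].dag_id
    run_id = context["run_id"]
    return f":red_circle: DAG {dag_id} failed (run_id={run_id})"


def notify_slack_on_failure(context: dict) -> None:
    """on_failure_callback: post the alert text to the Slack webhook."""
    payload = json.dumps({"text": build_failure_message(context)}).encode()
    request = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(request)
```

Attaching it at the DAG level, e.g. `@dag(..., on_failure_callback=notify_slack_on_failure)`, makes it fire when the DAG run fails.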

Airflow determines the DagRun state by checking the leaf nodes of your workflow. If the leaf nodes are successful (or skipped), the DagRun is marked as success. To achieve what you are after, you need to add a leaf task to your workflow that checks the condition you set and fails if it is met.
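As a toy model (not Airflow code, just an illustration of the rule above): the run fails only when a leaf task fails, so a skipped task that is not a leaf never turns the run red on its own.

```python
def dagrun_state(leaf_states: list[str]) -> str:
    """Toy model: the DagRun fails only if some leaf task failed."""
    if any(state == "failed" for state in leaf_states):
        return "failed"
    # Leaves that succeeded or were skipped still yield a "success" run,
    # which is exactly the behaviour the question complains about.
    return "success"


assert dagrun_state(["success", "skipped"]) == "success"
assert dagrun_state(["success", "failed"]) == "failed"
```

This is why the fix below adds an always-running leaf task that deliberately fails when any skipped task is found.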

from contextlib import closing

import pendulum

from airflow import models, settings
from airflow.decorators import dag, task
from airflow.exceptions import AirflowException, AirflowSkipException
from airflow.operators.empty import EmptyOperator
from airflow.utils.state import State


@dag(
    dag_id="my_test_dag",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
)
def generate_dag():
    start_task = EmptyOperator(task_id="task1")

    @task()
    def task2():
        # Simulate a skipped task.
        raise AirflowSkipException

    @task()
    def task3():
        return

    # trigger_rule="all_done" makes this leaf task run no matter whether
    # its upstream tasks succeeded, failed, or were skipped.
    @task(trigger_rule="all_done")
    def final_task(**context):
        with closing(settings.Session()) as session:
            # Count the task instances of this DagRun that ended up skipped.
            # The DagRun is matched via run_id; in Airflow 2.2+
            # TaskInstance no longer has a queryable execution_date column.
            count = (
                session.query(models.TaskInstance)
                .filter(
                    models.TaskInstance.dag_id == context["dag"].dag_id,
                    models.TaskInstance.run_id == context["run_id"],
                    models.TaskInstance.state == State.SKIPPED,
                )
                .count()
            )
            print(f"number of tasks in skipped status is {count}")
            if count > 0:
                # Failing this leaf task marks the whole DagRun as failed.
                raise AirflowException("Number of skipped tasks in DagRun > 0")

    start_task >> [task2(), task3()] >> final_task()


dag = generate_dag()

This will give you:

[screenshot: the resulting DAG run, with final_task failed]

Log for the final_task:

[screenshot: log output of final_task]

I'd like to note that, in my view, Skipped is a valid status for a task. Treating a skip as a failure can create odd cases like the one you are facing now, forcing you to find workarounds for problems you should not have to face in the first place.

