
Airflow: ExternalTaskSensor doesn't work as expected. Different task schedules

Colleagues, we need help. There are two DAGs, Parent and Child. Parent has its own schedule, say '30 * * * *', and Child runs on '1 8-17 * * 1-5'. Child should wait for Parent to finish (for example, up to 40 minutes): if Parent ends with an error, Child should also fail with an error; otherwise the next task of the Child DAG should run. The problem is that this does not work even in the simplest case, and I don't understand how to synchronize them. I wrote code like this:

Dag Parent

import time
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.sensors.external_task_sensor import ExternalTaskMarker

start_date = datetime(2021, 3, 1, 20, 36, 0)

class Exept(Exception):
    pass

def wait():
    time.sleep(3)
    with open('etl.txt', 'r') as txt:
        line = txt.readline()
        if line == 'err':
            print(1)
            raise Exept
    return 'etl success'


with DAG(
    dag_id="dag_etl1",
    start_date=start_date,
    schedule_interval='* * * * *',
    tags=['example2'],
) as etl1:
    parent_task = ExternalTaskMarker(
        task_id="parent_task",
        # the marker should point at the downstream (child) dag and task:
        external_dag_id="etl_child",
        external_task_id="dag_etl1",  # the sensor's task_id in the child dag
    )
    wait_timer = PythonOperator(task_id='wait_timer', python_callable=wait)
    
    wait_timer >> parent_task

Dag Child

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.sensors.external_task_sensor import ExternalTaskSensor

from etl_parent import etl1, parent_task

start_date = datetime(2021, 3, 1, 20, 36, 0)

def check():
    return 'I succeeded'

with DAG(
    dag_id='etl_child', 
    start_date=start_date, 
    schedule_interval='* * * * *',
    tags = ['testing_child_dag']
) as etl_child:
    status = ExternalTaskSensor(
        task_id="dag_etl1",
        external_dag_id=etl1.dag_id,
        external_task_id=parent_task.task_id,
        allowed_states=['success'],
        mode='reschedule',
        execution_delta=timedelta(minutes=1),
        timeout=60,
    )

    task1 = PythonOperator(task_id='task1', python_callable=check)
    
    status >> task1

As you can see, I'm trying to emulate a situation where the parent task fails if 'err' is written in the text file and succeeds in any other case. But this does not work at all as I expect. On the first run of the DAG everything is fine and works correctly. If I then change the data in the text file, the parent task still behaves correctly: for example, if I launch the parent DAG with a deliberate error, everything works as intended and the child DAG ends with an error. But if I change the text back, the parent again runs correctly while the child keeps failing for a while; later it may recover, but not reliably. If the run is deliberately successful, the situation is the same, just the other way around. Also, I do not understand how to organize waiting for the task to be completed in the parent DAG.

Please help) I have only recently started working with Airflow, so I may be missing something.

The most common cause of problems with ExternalTaskSensor is the execution_delta parameter, so I would start there.

I see that both your parent and child DAGs have exactly the same start_date and schedule_interval, yet your execution_delta is 1 minute. In that case the child DAG looks for a parent run that started at 20:35 (in your example), but the parent actually started at 20:36, so the sensor never finds it and fails. Just as a test, try setting your parent DAG to start at 20:35 and see if that solves the problem.
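To make the mismatch concrete, here is the date arithmetic the sensor performs (a minimal sketch; the variable names are illustrative, not part of the Airflow API):

```python
from datetime import datetime, timedelta

# ExternalTaskSensor looks for a parent run at:
#   child_execution_date - execution_delta
child_execution_date = datetime(2021, 3, 1, 20, 36)
execution_delta = timedelta(minutes=1)

target = child_execution_date - execution_delta
print(target)  # 2021-03-01 20:35:00 -- but the parent's first run is at 20:36
```

With identical schedules and start dates, the sensor should target the same execution date as the child, i.e. execution_delta should be omitted (or set to timedelta(0)).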

Here's a good article that goes into a bit more detail about the schedule_interval pitfall: https://medium.com/@fninsiima/sensing-the-completion-of-external-airflow-tasks-827344d03142

As for the waiting time, that's the timeout parameter of ExternalTaskSensor; in your case it waits 60 seconds before failing. Personally, I would be very cautious about setting a long timeout period. While a sensor is waiting (in the default 'poke' mode) it occupies a worker slot, so no other task can use it; this can block your other tasks from executing, especially if you have a lot of sensors.
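Putting both points together, a corrected sensor for your case might look like the sketch below (a config fragment only; the poke_interval and timeout values are illustrative assumptions, not prescriptions):

```python
from airflow.sensors.external_task_sensor import ExternalTaskSensor

status = ExternalTaskSensor(
    task_id="wait_for_etl1",
    external_dag_id="dag_etl1",
    external_task_id="parent_task",
    allowed_states=["success"],
    # no execution_delta: both dags share the same schedule and start_date,
    # so the sensor should target the same execution date as the child run
    mode="reschedule",   # frees the worker slot between checks
    poke_interval=60,    # check once a minute instead of holding a worker
    timeout=40 * 60,     # give up after the 40 minutes you mentioned
)
```

With mode="reschedule" the long timeout is much cheaper, because the worker slot is released between pokes rather than being held for the whole wait.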
