
Using dag_run variables in airflow Dag

I am trying to use Airflow variables to determine whether to execute a task or not. I have tried this and it's not working:

if '{{ params.year }}' == '{{ params.message }}':
     run_this = DummyOperator (
                task_id = 'dummy_dag'
               )

I was hoping to get some help making it work. Also, is there a better way of doing something like this in Airflow?

The comparison in your snippet can never work: Jinja expressions such as {{ params.year }} are only rendered inside an operator's templated fields when a task actually runs, so top-level Python code in the DAG file just compares the two literal strings. I think a good way to solve this is with BranchPythonOperator, which branches dynamically based on the provided DAG parameters. Consider this example:

Use params to provide the parameters to the DAG (this could also be done from the UI); in this example: {"enabled": True}

from airflow.decorators import dag, task
from airflow.utils.dates import days_ago
from airflow.operators.python import get_current_context, BranchPythonOperator

default_args = {"owner": "airflow"}  # assumed minimal defaults, added so the snippet is self-contained

@dag(
    default_args=default_args,
    schedule_interval=None,
    start_date=days_ago(1),
    catchup=False,
    tags=["example"],
    params={"enabled": True},
)
def branch_from_dag_params():
    def _print_enabled():
        context = get_current_context()
        enabled = context["params"].get("enabled", False)
        print(f"Task id: {context['ti'].task_id}")
        print(f"Enabled is: {enabled}")

    @task
    def task_a():
        _print_enabled()

    @task
    def task_b():
        _print_enabled()

Define a callable for the BranchPythonOperator in which you perform your conditionals and return the next task to be executed. You can access the execution context variables from **kwargs. Also keep in mind that this operator should return a single task_id or a list of task_ids to follow downstream, and those resultant tasks should always be directly downstream from it.

    def _get_task_run(ti, **kwargs):
        custom_param = kwargs["params"].get("enabled", False)

        if custom_param:
            return "task_a"
        else:
            return "task_b"

    branch_task = BranchPythonOperator(
        task_id="branch_task",
        python_callable=_get_task_run,
    )
    task_a_exec = task_a()
    task_b_exec = task_b()
    branch_task >> [task_a_exec, task_b_exec]


dag = branch_from_dag_params()  # calling the @dag-decorated function is what actually creates the DAG object
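Since the question title mentions dag_run specifically: whatever you pass when triggering a run ("Trigger DAG w/ config" in the UI, or airflow dags trigger branch_from_dag_params --conf '{"enabled": false}' on the CLI) is also reachable explicitly through dag_run.conf in the context. A minimal variant of the branching callable, assuming you want the run configuration to take precedence over the DAG-level params default, could look like this (the name _get_task_run_from_conf is just illustrative, not from the original answer):

def _get_task_run_from_conf(dag_run=None, **kwargs):
    # dag_run.conf holds the JSON passed at trigger time (may be empty);
    # fall back to the DAG-level "params" default when nothing was provided.
    conf = (dag_run.conf or {}) if dag_run else {}
    enabled = conf.get("enabled", kwargs["params"].get("enabled", False))
    return "task_a" if enabled else "task_b"

Wire it up exactly like _get_task_run above, as the python_callable of the BranchPythonOperator.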

The result is that task_a gets executed and task_b is skipped:

[screenshot: branch from dag params]

AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=branch_from_dag_params
AIRFLOW_CTX_TASK_ID=task_a
Task id: task_a
Enabled is: True

Let me know if that worked for you.

Docs
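As an aside (not part of the original answer): newer Airflow releases expose the same branching pattern as a TaskFlow decorator, so the standalone BranchPythonOperator can be swapped for a decorated function. A sketch, assuming Airflow 2.3+ where task.branch is available:

@task.branch
def choose_branch(params=None):
    # "params" is injected from the task context (same dict as context["params"])
    return "task_a" if params.get("enabled", False) else "task_b"

choose_branch() >> [task_a(), task_b()]

The branching logic and the downstream wiring stay the same; only the operator-style declaration changes to the decorator form.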
