
Apache Airflow: How to xcom_pull() a value into a DAG?

I have a custom operator which pushes an XCom value as below:

...
task_instance = context['task_instance']
task_instance.xcom_push("list_of_files", file_list)
...

It works fine. I have a DAG definition file (my_dag.py) where I create a task using my own operator, and it pushes the XCom value. I then want to run a for loop in the DAG file over this XCom value. How do I pull it?

You can't access the XCom value in your DAG definition. It is only available inside operators, by supplying the provide_context=True argument to the operator's constructor.
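If the loop can live inside a task rather than in the DAG file, a downstream callable can pull the list at run time. Below is a minimal sketch of that idea, not the asker's actual code: the task_id "list_files", the function process_files and the FakeTaskInstance class are all made up for illustration. FakeTaskInstance only stands in for Airflow's task instance so the snippet runs without a scheduler.

```python
class FakeTaskInstance:
    """Stand-in for Airflow's TaskInstance, used only to exercise the callable here."""

    def __init__(self):
        self._store = {}

    def xcom_push(self, key, value):
        self._store[key] = value

    def xcom_pull(self, task_ids=None, key="return_value"):
        # The real method also filters by task_ids; this stub only looks at the key
        return self._store.get(key)


def process_files(**context):
    """Callable for a PythonOperator placed downstream of the custom operator."""
    ti = context["task_instance"]
    # "list_files" is an assumed task_id for the task that pushed the value
    file_list = ti.xcom_pull(task_ids="list_files", key="list_of_files") or []
    processed = []
    for name in file_list:
        processed.append(name.upper())  # placeholder for the real per-file work
    return processed


# Simulate the upstream push, then call the downstream callable
ti = FakeTaskInstance()
ti.xcom_push("list_of_files", ["a.csv", "b.csv"])
result = process_files(task_instance=ti)
```

In a real DAG you would pass process_files as the python_callable of a PythonOperator and let Airflow supply the context. The trade-off is that all files are handled by one task instead of one task per file.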

If you want to use data from an operator in the DAG structure itself, you need to do the work your operator is doing outside of an operator, so that it runs when the DAG file is parsed.

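from airflow import DAG

def get_file_list():
    # SomeHook is a placeholder for whatever hook fetches the file list
    hook = SomeHook()
    return hook.run('something to get file list')

dag = DAG('tutorial', default_args=default_args)

# This loop runs every time the DAG file is parsed
for i, file in enumerate(get_file_list()):
    # Do something with the file passed as a parameter
    task = SomeOperator(task_id=f'process_file_{i}', params={'file': file}, dag=dag)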
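One practical detail when creating a task per file: every task needs a unique task_id, and Airflow only accepts alphanumeric characters, dashes, dots and underscores in it. Here is a rough sketch of deriving an id from a file path. The helper task_id_for and the sample paths are hypothetical, not part of Airflow or of the question.

```python
import re


def task_id_for(path, prefix="process"):
    """Build an Airflow-safe task_id from a file path (hypothetical helper)."""
    # Replace every character outside [A-Za-z0-9_.-] with an underscore
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", path).strip("_")
    return f"{prefix}_{safe}"


# Sample paths, made up for illustration
files = ["in/2020-01-01 a.csv", "in/b.csv"]
task_ids = [task_id_for(f) for f in files]
```

Note that two different paths can collapse to the same id after substitution (for example "a/b" and "a b"), so check for duplicates if your file names allow that.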

It is generally bad practice to access XCom from the DAG itself rather than from a task in the DAG. That said, sometimes it is necessary. For example, you may need to do this when dynamically creating DAGs.

Here is an example where I pull a list of unrun jobs inside a DAG file. I'm using this in a subdag, so whenever this code runs I can be sure the parent DAG's XCom already contains the value.

    xcom_unrun_jobs = None
    active_runs = parent_dag.get_active_runs()
    if active_runs:
        # Take the last task instance returned for the most recent active run
        ti = parent_dag.get_task_instances(settings.Session, start_date=active_runs[-1])[-1]
        xcom_unrun_jobs = ti.xcom_pull(dag_id=parent_dag.dag_id, task_ids=unrun_job_task_id)
