How can I return a list from a PythonOperator in Airflow and use it as an argument for subsequent tasks in the same DAG?
I have 3 tasks to run in the same DAG. Task1 returns a list of dictionaries, and task2 and task3 each try to use one dictionary element from the result returned by task1.
def get_list():
    ....
    return listOfDict

def parse_1(example_dict):
    ...

def parse_2(example_dict):
    ...

dag = DAG('dagexample', default_args=default_args)

data_list = PythonOperator(
    task_id='get_lists',
    python_callable=get_list,
    dag=dag)

for data in data_list:
    sub_task1 = PythonOperator(
        task_id='data_parse1' + data['id'],
        python_callable=parse_1,
        op_kwargs={'dataObject': data},
        dag=dag,
    )
    sub_task2 = PythonOperator(
        task_id='data_parse2' + data['id'],
        python_callable=parse_2,
        op_kwargs={'dataObject': data},
        dag=dag,
    )
You should use XCom for passing variables/messages between different tasks. Take a look at this example: https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_xcom.py
For your case, it should be something similar to the following:
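# Airflow 1.x import paths (assumed, since the answer relies on provide_context)
import airflow
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

default_args = {
    'owner': 'airflow',
    'start_date': airflow.utils.dates.days_ago(2),
    'provide_context': True,  # This is needed so the callables receive 'ti'
}

def get_list():
    ....
    return listOfDict  # the return value is pushed to XCom

def parse_1(**kwargs):
    ti = kwargs['ti']
    # get listOfDict
    v1 = ti.xcom_pull(key=None, task_ids='get_lists')
    # v1 is the list of dicts returned by get_list; use it as a normal Python list
    ...

def parse_2(**kwargs):
    ti = kwargs['ti']
    # get listOfDict
    v1 = ti.xcom_pull(key=None, task_ids='get_lists')
    ...

dag = DAG('dagexample', default_args=default_args)

data_list = PythonOperator(
    task_id='get_lists',
    python_callable=get_list,
    dag=dag)

# get_list() is called here, while the DAG file is parsed, to decide which tasks to create
for data in get_list():
    sub_task1 = PythonOperator(
        task_id='data_parse1' + data['id'],
        python_callable=parse_1,
        op_kwargs={'dataObject': data},
        dag=dag,
    )
    sub_task2 = PythonOperator(
        task_id='data_parse2' + data['id'],
        python_callable=parse_2,
        op_kwargs={'dataObject': data},
        dag=dag,
    )
    # Run the parse tasks only after get_lists has pushed its result
    data_list >> sub_task1
    data_list >> sub_task2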
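Since each parse task pulls the whole list, it still has to pick out its own dictionary. Here is a minimal sketch of that step, assuming every dictionary carries a unique 'id' key (as the task_id construction suggests) and that the task is told which id to look for. The names pick_by_id, parse_one, data_id, and FakeTaskInstance are hypothetical; the fake class only mimics the xcom_pull call so the callable can be tried outside Airflow.

```python
def pick_by_id(list_of_dicts, wanted_id):
    """Return the first dict whose 'id' equals wanted_id, or None."""
    for item in list_of_dicts or []:
        if item.get('id') == wanted_id:
            return item
    return None


def parse_one(**kwargs):
    """Callable for a PythonOperator; expects 'ti' and 'data_id' in kwargs."""
    ti = kwargs['ti']
    # Pull the list returned by the upstream 'get_lists' task.
    list_of_dicts = ti.xcom_pull(task_ids='get_lists')
    return pick_by_id(list_of_dicts, kwargs['data_id'])


class FakeTaskInstance:
    """Stand-in for Airflow's task instance, for local experiments only."""

    def __init__(self, stored):
        self._stored = stored  # maps task_id -> value that task returned

    def xcom_pull(self, task_ids=None, key=None):
        return self._stored.get(task_ids)


fake_ti = FakeTaskInstance({'get_lists': [{'id': 'a', 'v': 1}, {'id': 'b', 'v': 2}]})
selected = parse_one(ti=fake_ti, data_id='b')
print(selected)
```

In the DAG itself, the id could be handed over with op_kwargs={'data_id': data['id']} rather than passing the whole dictionary.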
You can use XComs, as they are designed for inter-task communication. If your dictionary is very big, then I recommend storing it as a CSV file. Generally, tasks in Airflow don't share data between them, so XComs are a way to achieve this, but they are limited to small amounts of data.
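One way to follow the CSV suggestion is to have the first task write the rows to a file and return only the file path, so the XCom itself stays small. This is a minimal sketch using only the standard library; write_rows and read_rows are hypothetical helper names, all dictionaries are assumed to share the same keys, and the path is assumed to be on storage that every worker can read.

```python
import csv
import os
import tempfile


def write_rows(rows, path):
    """Write a non-empty list of dicts to a CSV file and return the path."""
    fieldnames = list(rows[0].keys())
    with open(path, 'w', newline='') as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path  # returning only the path keeps the XCom payload small


def read_rows(path):
    """Read the CSV back into a list of dicts (all values come back as strings)."""
    with open(path, newline='') as handle:
        return list(csv.DictReader(handle))


# Local round trip; in a DAG the path would travel between tasks via XCom.
csv_path = os.path.join(tempfile.mkdtemp(), 'data_list.csv')
rows = [{'id': 'a', 'value': '1'}, {'id': 'b', 'value': '2'}]
restored = read_rows(write_rows(rows, csv_path))
print(restored)
```

Note that CSV does not preserve types, so numeric fields need to be converted again after reading.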