Sometimes I find it handy to create tasks using a loop.
Below is an example of a SqoopOperator in which I use the XCom value from the preceding PythonOperator in the where clause. I am trying to use a variable, get_delivery_sqn_task_id, to select the correct XCom value with ti.xcom_pull(task_ids=get_delivery_sqn_task_id), but this does not work (it returns ()).
I can take everything out of the loop, but I think that makes the code quite ugly. Is there an elegant way to use a variable for task_ids when retrieving XCom values? Otherwise, I guess the best solution is to use Airflow Variables.
    for table in tables:
        get_delivery_sqn_task_id = 'get_delivery_sqn_' + table

        get_delivery_sqn_task = PythonOperator(
            task_id=get_delivery_sqn_task_id,
            python_callable=get_delivery_sqn,
            op_kwargs={
                'table_name': table
            },
            provide_context=True,
            dag=dag
        )

        sqoop_operator_task = SqoopOperator(
            task_id="sqoop_" + table,
            conn_id="DWDH_PROD",
            table=table,
            cmd_type="import",
            target_dir="/sourcedata/sqoop_tmp/" + table,
            num_mappers=1,
            where="delivery_sqn > {{ ti.xcom_pull(task_ids=get_delivery_sqn_task_id, key='return_value') }}",
            dag=dag
        )
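The where string is a Jinja template that is rendered when the task runs, and the Python loop variable get_delivery_sqn_task_id is not defined in that template context. The task id therefore has to be written into the string while the DAG file is parsed. You can do:

    "delivery_sqn > {{{{ ti.xcom_pull(task_ids='{}', key='return_value') }}}}".format(get_delivery_sqn_task_id)

The doubled braces are escapes for str.format and come out as {{ and }}. The quotes around {} make the task id a string literal inside the template rather than another undefined name.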
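To check what string Airflow would actually receive as the template, the brace escaping can be tried in plain Python without Airflow installed. This is only a minimal sketch: the helper name build_where_clause and the table names are hypothetical, and the single quotes around the task id are my assumption about what Jinja needs in order to treat it as a string literal.

```python
def build_where_clause(task_id: str) -> str:
    """Return a Jinja template string with the task id baked in as a literal.

    Hypothetical helper, not part of Airflow.
    """
    # In str.format, '{{' and '}}' are escapes for single literal braces,
    # so four braces in the source become the two braces Jinja expects.
    template = (
        "delivery_sqn > "
        "{{{{ ti.xcom_pull(task_ids='{}', key='return_value') }}}}"
    )
    return template.format(task_id)


# Hypothetical table names, standing in for the real `tables` list.
tables = ["orders", "customers"]

# Build one where clause per table, as the loop in the DAG file would.
where_clauses = {
    table: build_where_clause("get_delivery_sqn_" + table) for table in tables
}

for table, where in where_clauses.items():
    print(table, "->", where)
```

Each resulting string contains a concrete task id, so nothing in the template depends on a Python variable that only exists while the DAG file is being parsed.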