
Airflow - Defining the key and value for an xcom_push function

I am trying to pass a Python function in Airflow. I am not sure what the key and value should be for the xcom_push function. Could anyone assist with this? Thanks.

def db_log(**context):
  db_con = psycopg2.connect(" dbname = 'name' user = 'user' password = 'pass' host = 'host' port = '5439' sslmode = 'require' ")
  task_instance = context['task_instance']
  task_instance.xcom_push(key=db_con, value = db_log)
  return (db_con)

Could anyone help me get the correct key and value for the xcom_push function? Thanks.

The correct way of calling it can be found in the examples, e.g.: https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_xcom.py

So here it should be:

task_instance.xcom_push(key=<string identifier>, value=<actual value / object>)

In your case:

task_instance.xcom_push(key="db_con", value=db_con)
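The key/value pattern above can be tried outside Airflow with a small stand-in. This is only a sketch: `FakeTaskInstance` and `use_settings` are hypothetical names, not part of Airflow, and the stand-in mimics just the `xcom_push` / `xcom_pull` calls used here. It also assumes plain connection settings are pushed instead of the live `db_con` object, since a live database connection generally cannot be serialized for XCom storage.

```python
class FakeTaskInstance:
    """Hypothetical in-memory stand-in for Airflow's TaskInstance (illustration only)."""

    def __init__(self, task_id, store):
        self.task_id = task_id
        self.store = store  # shared dict playing the role of the XCom table

    def xcom_push(self, key, value):
        # The key is a plain string identifier; the value is the data to share.
        self.store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key="return_value"):
        return self.store.get((task_ids, key))


def db_log(**context):
    # Assumption: push serializable settings rather than a live connection object.
    db_settings = {"dbname": "name", "host": "host", "port": 5439, "sslmode": "require"}
    context["task_instance"].xcom_push(key="db_settings", value=db_settings)


def use_settings(**context):
    # A downstream callable reads the value back using the same string key.
    return context["task_instance"].xcom_pull(task_ids="db_log", key="db_settings")


store = {}
db_log(task_instance=FakeTaskInstance("db_log", store))
pulled = use_settings(task_instance=FakeTaskInstance("use_settings", store))
print(pulled["port"])
```

In a real DAG the two callables would be wired to operators, and Airflow would supply the task instance through the context.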

Refer to the example below:

Hope that helps.

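# Airflow 1.x import paths (assumed, since the operators below use provide_context)
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import ShortCircuitOperator
from airflow.utils.dates import days_ago

args = {
    'owner': 'airflow',
    'start_date': days_ago(1)  # assumed value; the original left start_date undefined
}

dag = DAG(dag_id='test_dag', schedule_interval=None, default_args=args)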
y = 0

def LoadYaml(**kwargs):
        y = 'df-12345567789'
        kwargs['ti'].xcom_push(key='name',value=y)
        return True

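def CreatePipeLine(**kwargs):
        print("I m client")
        # ShortCircuitOperator skips downstream tasks when the callable returns a falsy value
        return True

def ActivatePipeline(client, pipelineId, **kwargs):
        # **kwargs absorbs the context that is passed because provide_context=True
        print("activated", client, pipelineId)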

start_task = DummyOperator(task_id='Start_Task', dag=dag)

LoadYaml_task = ShortCircuitOperator(task_id='LoadYaml_task',provide_context=True,python_callable=LoadYaml,dag=dag)

start_task.set_downstream(LoadYaml_task)

CreatePipeLine_task = ShortCircuitOperator(task_id='CreatePipeLine_task',provide_context=True,python_callable=CreatePipeLine,op_kwargs = {'client' : 'HeyImclient'},dag=dag)

LoadYaml_task.set_downstream(CreatePipeLine_task)

ActivatePipeline_task= ShortCircuitOperator(task_id='ActivatePipeline_task',provide_context=True,python_callable=ActivatePipeline,op_kwargs = {'client' : 'You','pipelineId' : '1234'},dag=dag)

CreatePipeLine_task.set_downstream(ActivatePipeline_task)

This is a bit old, but from what I understand, if you are running db_log as a task, then returning db_con would automatically push it to XCom.

You could then access it with {{ti.xcom_pull(task_ids='TASK_NAME_HERE')}}
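As a rough sketch of the return-value route described above: when a callable runs under a PythonOperator, its return value is stored under the default key `return_value`, and `xcom_pull` with only `task_ids` reads that key. The `report` function and the settings dictionary are hypothetical, and the sketch assumes the task returns serializable settings rather than the live `db_con`.

```python
def db_log(**context):
    # Whatever the callable returns is pushed to XCom under the key "return_value".
    return {"dbname": "name", "host": "host", "port": 5439}


def report(**context):
    # With no key given, xcom_pull reads the "return_value" entry of the named task.
    settings = context["ti"].xcom_pull(task_ids="db_log")
    return "{host}:{port}/{dbname}".format(**settings)
```

Here `task_ids="db_log"` assumes the producing task was given the task_id `db_log`; use whatever task_id the operator actually has.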

Instead of using XCom to connect to your DB, I would recommend using Connections: https://airflow.apache.org/howto/connection/index.html

Start by setting up a connection to your DB, either from the command line with:

airflow connections -a --conn_id postgres_custom --conn_host <your-host> --conn_type postgres --conn_port 1234 --conn_login <username> --conn_password <password> --conn_extra '{"sslmode": "require"}'

Or directly from the UI. Here is some documentation on how to set up a Postgres connection in Airflow (it works with other DB types as well): https://airflow.apache.org/howto/connection/postgres.html

Then you can query your database with a DAG such as:

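import airflow.utils.dates
from airflow import DAG
from airflow.operators.postgres_operator import PostgresOperator  # Airflow 1.x path

DAG_ARGS = {
    'owner': 'airflow',
    'start_date': airflow.utils.dates.days_ago(2),
}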


DAG_ID = "Dummy_DAG"


with DAG(dag_id=DAG_ID,
         default_args=DAG_ARGS,
         schedule_interval=None) as dag:

    query_1 = PostgresOperator(
        task_id='POSTGRES_QUERY',
        postgres_conn_id='postgres_custom',
        sql= """SELECT COUNT(*) FROM TABLE A""",
        database="my-db",
        dag=dag,
    )

    query_2 = PostgresOperator(
        task_id='POSTGRES_QUERY_2',
        postgres_conn_id='postgres_custom',
        sql="""SELECT COUNT(*) FROM TABLE B""",
        database="my-db",
        dag=dag,
    )

    query_1 >> query_2
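If a Python task needs the query result itself, the same connection id can also be used through a hook. The following is a minimal sketch, not a definitive implementation: `count_rows` and the table name `my_table` are hypothetical, and the import path is the Airflow 1.x one (in Airflow 2.x the hook lives in `airflow.providers.postgres.hooks.postgres`).

```python
def count_rows(**context):
    # Imported inside the callable so the DAG file itself stays cheap to parse.
    from airflow.hooks.postgres_hook import PostgresHook  # Airflow 1.x path

    # 'postgres_custom' is the connection id created earlier via the CLI or UI.
    hook = PostgresHook(postgres_conn_id="postgres_custom")

    # get_first returns the first row of the result; "my_table" is a placeholder name.
    row = hook.get_first("SELECT COUNT(*) FROM my_table")
    return row[0]
```

When run as the python_callable of a PythonOperator, the returned count is pushed to XCom, so a downstream task can read it with xcom_pull.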
