
How can I use a sensor in Airflow to read from MongoDB?

I want to build a workflow in which my DAG kick-starts a process on a remote server and then monitors whether each task in that process has succeeded. The DAG should read the task status from MongoDB and trigger the next task only once the current one has succeeded. Is there any way to achieve this? I think I should use a mongo_sensor, but I am not sure how to use it.

I have successfully read from MongoDB using this code:

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta

import pendulum

local_tz = pendulum.timezone("Europe/Amsterdam")

def function1():
    print("hello")
    # Imported at call time, so only the worker that runs the task needs pymongo
    from pymongo import MongoClient
    client = MongoClient("mongodb://rpa_task:rpa_task123@ds141641.mlab.com:41641/rpa_task")

    mydb = client['rpa_task']
    collect2 = mydb['business_process_mgts']
    # Print the sequenceFlow field of every document for this process id
    cursor = collect2.find({"process.id": "ross1335_testingpurchase_1915"})
    for i in cursor:
        print(i['sequenceFlow'])



default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2019, 7, 8, tzinfo=local_tz),
    'email': ['shubhamkalyankari01@gmail.com'],
    'email_on_failure': True,
    'email_on_retry': True,
    'retries': 3,
    'retry_delay': timedelta(seconds=5),
}

# schedule_interval belongs on the DAG itself; it has no effect inside default_args
dag = DAG('mongo1.py', default_args=default_args, schedule_interval='@hourly')

t1=PythonOperator(dag=dag,
     task_id='t1',
     provide_context=False,
     python_callable=function1,)


It reads the MongoDB documents successfully.

You are off to a good start. From here you can take two approaches:

  1. Continue using PythonOperator: modify your python_callable to hold the logic you want:

    sleep - wake up - test whether the condition is met - flag success / go back to sleep


  2. Use a custom Sensor: extend Airflow's BaseSensorOperator and define that same sleeping / waiting logic in its poke() function. If you plan to use a sensor, do have a look at the mode param so that you don't end up with a deadlocked DAG.
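
The sleep / wake up / test loop from the first approach could look roughly like the sketch below. It is only an illustration under assumptions: the `status` field, the `"succeeded"` value, and the names `task_succeeded`, `wait_until` and `wait_for_process` are hypothetical and not part of the question's schema or of Airflow. Only the `process.id` filter and the database and collection names are taken from the question.

```python
import time


def task_succeeded(collection, process_id,
                   status_field="status", done_value="succeeded"):
    """Return True once the process document reports the wanted status.

    `collection` is anything with a pymongo-style find_one(filter) method.
    `status_field` and `done_value` are hypothetical; adapt them to your schema.
    """
    doc = collection.find_one({"process.id": process_id})
    return doc is not None and doc.get(status_field) == done_value


def wait_until(condition, poke_interval=30, timeout=600,
               sleep=time.sleep, clock=time.monotonic):
    """Test the condition, sleep, and repeat until it holds or the timeout passes."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True  # flag success
        if clock() >= deadline:
            # Raising makes the Airflow task fail, so retries / alerts apply
            raise TimeoutError("condition not met within %s seconds" % timeout)
        sleep(poke_interval)  # go back to sleep


def wait_for_process(mongo_uri, process_id, db_name="rpa_task",
                     collection_name="business_process_mgts", **wait_kwargs):
    """python_callable for a PythonOperator: block until the process succeeds."""
    from pymongo import MongoClient  # imported at call time, as in the question

    client = MongoClient(mongo_uri)
    try:
        collection = client[db_name][collection_name]
        return wait_until(lambda: task_succeeded(collection, process_id),
                          **wait_kwargs)
    finally:
        client.close()


# Hypothetical wiring inside the DAG file from the question
# (MONGO_URI stands for your own connection string):
# t2 = PythonOperator(
#     task_id='wait_for_process',
#     python_callable=wait_for_process,
#     op_kwargs={'mongo_uri': MONGO_URI,
#                'process_id': 'ross1335_testingpurchase_1915'},
#     dag=dag,
# )
# t1 >> t2
```

For the second approach, the body of `task_succeeded` is essentially what a `poke()` method would return: `True` when the status is reached, `False` otherwise. The sensor's own `poke_interval` and `timeout` arguments then take over the role of `wait_until`. With the default `mode='poke'` a sensor holds its worker slot for the whole wait, whereas `mode='reschedule'` gives the slot back between pokes, which is what avoids the deadlock mentioned above when many sensors wait at once.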
