How can I use a sensor in Airflow to read from MongoDB?
I want to build a workflow in which my DAG kick-starts a process on a remote server and monitors whether each task in that process succeeds. It should read the status from MongoDB and, if a task has succeeded, trigger the next task. Is there any way to achieve this? I think I should use a mongo_sensor, but I am not sure how to use it.
I have successfully read from MongoDB using this code:
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
import pendulum

local_tz = pendulum.timezone("Europe/Amsterdam")


def function1():
    print("hello")
    # Imported inside the callable so pymongo is only needed where the task runs.
    from pymongo import MongoClient
    client = MongoClient("mongodb://rpa_task:rpa_task123@ds141641.mlab.com:41641/rpa_task")
    mydb = client['rpa_task']
    collect2 = mydb['business_process_mgts']
    # Print the sequence flow of every document that belongs to this process.
    cursor = collect2.find({"process.id": "ross1335_testingpurchase_1915"})
    for i in cursor:
        print(i['sequenceFlow'])


default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2019, 7, 8, tzinfo=local_tz),
    'email': ['shubhamkalyankari01@gmail.com'],
    'email_on_failure': True,
    'email_on_retry': True,
    'retries': 3,
    'retry_delay': timedelta(seconds=5),
}

# schedule_interval is set on the DAG itself; default_args only reach the operators.
dag = DAG('mongo1.py', default_args=default_args, schedule_interval='@hourly')

t1 = PythonOperator(
    dag=dag,
    task_id='t1',
    provide_context=False,
    python_callable=function1,
)
It reads the MongoDB documents successfully.
You have started well. From here you can take two approaches:

1. Continue using PythonOperator: modify your python_callable to hold the logic you want: sleep, wake up, test whether the condition is met, then either flag success or go back to sleep.
2. Use a custom sensor: extend Airflow's BaseSensorOperator and define that same sleeping / waiting logic in its poke() function.

If you plan to use a sensor, make sure to have a look at the mode param so that you don't end up with a deadlocked DAG.
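The sleep / wake up / test loop described above could look roughly like the sketch below. It is only an illustration under assumptions: the post does not show how the remote process records task status, so the `status` field, the `"succeeded"` value, and the helper names `task_succeeded` and `wait_for_success` are all hypothetical. The MongoDB lookup is passed in as a plain callable so the waiting logic stays independent of pymongo.

```python
import time
from typing import Any, Callable, Mapping, Optional

# Hypothetical schema: the original post does not say where the task status
# lives in the document, so adjust the field name and value to the real data.
STATUS_FIELD = "status"
SUCCESS_VALUE = "succeeded"


def task_succeeded(doc: Optional[Mapping[str, Any]]) -> bool:
    """Return True when the status document marks the task as succeeded."""
    return doc is not None and doc.get(STATUS_FIELD) == SUCCESS_VALUE


def wait_for_success(
    fetch_doc: Callable[[], Optional[Mapping[str, Any]]],
    poke_interval: float = 60.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Test the condition, sleep, and repeat; raise TimeoutError after `timeout` seconds."""
    deadline = clock() + timeout
    while True:
        # Wake up and test whether the condition is met.
        if task_succeeded(fetch_doc()):
            return  # flag success: the task finishes and downstream tasks can run
        if clock() >= deadline:
            raise TimeoutError("task did not succeed within %s seconds" % timeout)
        # Condition not met yet: go back to sleep.
        sleep(poke_interval)
```

Inside the python_callable from the question, `fetch_doc` might be something like `lambda: collect2.find_one({"process.id": "ross1335_testingpurchase_1915"})`. With the sensor approach, only the check is needed: `poke()` in a `BaseSensorOperator` subclass would return the result of `task_succeeded(...)`, and Airflow does the waiting based on the sensor's `poke_interval` and `timeout`. Setting `mode="reschedule"` releases the worker slot between pokes, which is what helps avoid the deadlock mentioned above. Airflow's Mongo integration also includes a `MongoSensor` that pokes until a query matches a document; its exact parameters depend on the Airflow version, so check the documentation for the one you run.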