
Reschedule DAG on task success/failure

Consider a very simple Apache Airflow DAG:

FileSensor -> PythonOperator

where FileSensor waits for some files to appear (with a relatively short poke_interval) and PythonOperator processes these files. This DAG is scheduled @once but is meant to run indefinitely - how could I make it reschedule itself and run again, from within the PythonOperator, after it succeeds (or fails)?
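For reference, a minimal sketch of the DAG described above, assuming Airflow 2.x imports; the dag_id, filepath and callable below are placeholders, not taken from the question:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.sensors.filesystem import FileSensor


def process_files():
    # Placeholder for the actual file-processing logic.
    ...


with DAG(
    dag_id="file_processing",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@once",
    catchup=False,
) as dag:
    wait_for_files = FileSensor(
        task_id="wait_for_files",
        filepath="/data/incoming/data.csv",  # placeholder path
        poke_interval=30,                    # relatively short poke interval
    )
    process = PythonOperator(
        task_id="process_files",
        python_callable=process_files,
    )

    wait_for_files >> process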

In general I think Elad's suggestion might work, however I would argue it's bad practice. DAGs are by design (and by name) acyclic, so creating any kind of loop within one might cause it to behave unexpectedly.

Also, based on the Airflow documentation, you should set your DAG's schedule to None if you plan to use an external DAG trigger. Personally, I'm not sure it will necessarily break something, but it can definitely give you outputs you don't expect, and it will probably take you longer to debug later if something goes wrong.

IMHO a better approach would be for you to rethink your design. If you need to reschedule the DAG on failure, you can take advantage of the sensor's reschedule mode: https://www.astronomer.io/guides/what-is-a-sensor . I'm not sure why you would want to re-run it on success; if it's a case of multiple files in the source, I would rather create multiple sensors with a variable parameter in a for loop in your DAG script (see the sketch below).
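A rough sketch of both ideas, where the file names, paths and timeout are made-up examples: mode="reschedule" frees the worker slot between pokes, and a for loop creates one FileSensor per expected file.

from datetime import datetime

from airflow import DAG
from airflow.sensors.filesystem import FileSensor

EXPECTED_FILES = ["orders.csv", "customers.csv"]  # hypothetical list of files

with DAG(
    dag_id="multi_file_wait",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,  # triggered externally, so no schedule
    catchup=False,
) as dag:
    for name in EXPECTED_FILES:
        FileSensor(
            task_id=f"wait_for_{name.replace('.', '_')}",
            filepath=f"/data/incoming/{name}",  # placeholder path
            poke_interval=60,
            mode="reschedule",  # release the worker slot between checks
            timeout=60 * 60,    # fail the sensor after an hour of waiting
        )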

Just as @Elad suggested, TriggerDagRunOperator was the way to go. In conjunction with the reset_dag_run and execution_date parameters, I was able to solve this.
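For anyone landing here later, a minimal sketch of that approach on Airflow 2.x might look like the following. The dag_id, filepath and callable are placeholders, and trigger_rule="all_done" is my assumption for covering both the success and failure cases; the answer itself only names TriggerDagRunOperator, reset_dag_run and execution_date.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.sensors.filesystem import FileSensor


def process_files():
    # Placeholder for the actual file-processing logic.
    ...


with DAG(
    dag_id="file_processing",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@once",
    catchup=False,
) as dag:
    wait_for_files = FileSensor(
        task_id="wait_for_files",
        filepath="/data/incoming/data.csv",  # placeholder path
        poke_interval=30,
    )
    process = PythonOperator(
        task_id="process_files",
        python_callable=process_files,
    )
    restart = TriggerDagRunOperator(
        task_id="restart_dag",
        trigger_dag_id="file_processing",      # trigger this same DAG again
        execution_date="{{ execution_date }}", # reuse the current logical date
        reset_dag_run=True,                    # clear the existing run so it reruns
        trigger_rule="all_done",               # assumption: fire on success or failure
    )

    wait_for_files >> process >> restart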
