
Skip the remaining tasks within an Airflow DAG if the S3 sensor is not able to find the file

Is there a way I can skip the remaining tasks within a DAG if the S3 sensor operator is not able to find the file in the S3 location?

I know that we can use the ShortCircuitOperator to skip tasks, but I am looking for a way to integrate the ShortCircuitOperator and the S3 sensor operator together. If that is not possible, is there any other way I can achieve this? Thank you in advance.

Assuming that by "is not able to find the file in the S3 location" you mean the sensor operator fails due to a timeout, there is an implicit way to handle this through the Airflow Scheduler.

You can use the soft_fail parameter available to all of the sensor operators that inherit from BaseSensorOperator (which all of the S3 sensors do). If soft_fail is set to True, the sensor operator will be set to a "skipped" state when it fails due to a timeout, rather than being set to a state of "failed".
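As a minimal sketch, a sensor task with soft_fail enabled might look like the following. This assumes the Amazon provider package's import path for recent Airflow 2.x versions (older versions used airflow.sensors.s3_key_sensor); the bucket and key names are placeholders.

```python
# Hypothetical DAG fragment: an S3 sensor that skips instead of failing
# on timeout. Bucket/key names are illustrative placeholders.
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor

wait_for_file = S3KeySensor(
    task_id="wait_for_file",
    bucket_name="my-bucket",          # placeholder bucket
    bucket_key="incoming/data.csv",   # placeholder key
    poke_interval=60,                 # check for the file every minute
    timeout=60 * 60,                  # give up after one hour
    soft_fail=True,                   # on timeout: state "skipped", not "failed"
)
```

With soft_fail=True, a timeout marks the task "skipped", which is what lets the Scheduler cascade the skip to downstream tasks as described below.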

If the downstream tasks do not have a trigger_rule that allows them to execute when an upstream task is set to "skipped" (the default trigger_rule is "all_success" -- more on Trigger Rules here), the Scheduler will mark the downstream tasks as "skipped" as well. This will continue to propagate down the DAG.

Basically, the Scheduler looks at each downstream task's trigger_rule, sees that the task should not run because an upstream task was "skipped", and subsequently skips that task as well.
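To sketch how trigger rules interact with this cascade: a task with the default trigger_rule is skipped along with its skipped upstream, while a task given a more permissive rule still runs. This uses EmptyOperator (called DummyOperator in older Airflow releases); the task IDs are illustrative.

```python
# Hypothetical DAG fragment illustrating how trigger rules control
# whether the "skipped" state propagates to a task.
from airflow.operators.empty import EmptyOperator
from airflow.utils.trigger_rule import TriggerRule

# Default trigger_rule is "all_success": if any upstream task (e.g. the
# soft-failed sensor) is skipped, this task is skipped too.
process = EmptyOperator(task_id="process")

# A task that should run regardless, such as a notification or cleanup
# step, can opt out of the cascade with a more permissive trigger rule.
notify = EmptyOperator(
    task_id="notify",
    trigger_rule=TriggerRule.NONE_FAILED,  # runs even if upstreams were skipped
)
```

Wiring these after the sensor (sensor >> process >> notify) would give you a chain where process is skipped on sensor timeout but notify still executes.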

