Is there a way to skip the remaining tasks within a DAG if the S3 sensor operator is not able to find the file in the S3 location?
I know that we can use the ShortCircuitOperator to skip tasks, but I am looking for a way to integrate the ShortCircuitOperator and the S3 sensor operator together. If not, is there another way I can achieve this? Thank you in advance.
Assuming that by "is not able to find the file in the S3 location" you mean the sensor operator fails due to a timeout, there is an implicit way to handle this through the Airflow Scheduler.
You can use the soft_fail parameter, which is available on every sensor that inherits from BaseSensorOperator (which all of the S3 sensors do). If soft_fail is set to True, the sensor task is set to a "skipped" state when it times out, rather than to a "failed" state.
If the downstream tasks do not have a trigger_rule that allows them to execute when an upstream task is "skipped" (the default trigger_rule is "all_success" -- more on Trigger Rules here), the Scheduler will mark the downstream tasks as "skipped" as well, and this will continue to propagate down the DAG.
Basically, the Scheduler looks at each downstream task's trigger_rule, sees that the task should not run because an upstream task was "skipped", and subsequently skips that task too.
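The propagation logic above can be illustrated with a plain-Python sketch (this is not Airflow's actual scheduler code, just a simplified model of how the two trigger rules in question treat a skipped upstream task):

```python
def resolve_state(upstream_states, trigger_rule="all_success"):
    """Return what a task would do given the states of its upstream tasks.

    Simplified model: only handles "success" and "skipped" upstream states
    and the two trigger rules relevant here.
    """
    if trigger_rule == "all_success":
        # Default rule: a skipped upstream task skips this task too.
        if any(s == "skipped" for s in upstream_states):
            return "skipped"
        return "run"
    if trigger_rule == "none_failed":
        # Runs even when upstream tasks were skipped, as long as none failed.
        if all(s in ("success", "skipped") for s in upstream_states):
            return "run"
        return "wait"
    raise ValueError(f"unsupported trigger_rule: {trigger_rule}")


# The sensor timed out with soft_fail=True, so it is "skipped";
# with the default rule the skip cascades down the chain:
task_b_state = resolve_state(["skipped"])       # skipped
task_c_state = resolve_state([task_b_state])    # skipped, and so on

# A downstream task that should run regardless can opt out of the cascade:
cleanup_state = resolve_state(["skipped"], trigger_rule="none_failed")
```

This also shows the escape hatch: if you want a particular downstream task (such as a cleanup step) to run even when the sensor is skipped, give it trigger_rule="none_failed" instead of the default.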