
How to run/re-run a subset of jobs in an AWS Glue workflow?

I am building an AWS Glue workflow composed of long-running jobs, many of which are prone to failure. Is there any way I can re-run a specific branch of my workflow after a failure?

For example, my workflow looks something like this:

<Start Trigger> -> [Job 1] -> [Job 2] -> [Job 4]
       ↳ [Job 4]

Let's say [Job 1] and [Job 4] each take 3 hours and both complete successfully. Then [Job 2] is triggered but fails, leaving my workflow in this state:

<Start Trigger> -> [Job 1 ✔] -> [Job 2 ✗] -> [Job 4]
       ↳ [Job 4 ✔]

I make a change which fixes [Job 2] and believe it will run successfully when re-run. I'd like to be able to re-run only the [Job 2] -> [Job 4] branch of the workflow since all other parent jobs have completed successfully.

Is there any way this can be done in AWS Glue? I'm considering building an AWS Step Functions workflow of Glue jobs instead, as Step Functions workflows seem to have this functionality.

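This has been possible since August 2020: AWS Glue can resume a workflow run from the nodes that failed, so the jobs that already succeeded do not have to run again.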

https://docs.aws.amazon.com/glue/latest/dg/resuming-workflow.html
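For anyone who wants to script this rather than use the console, here is a minimal sketch using the boto3 Glue client calls `get_workflow_run` and `resume_workflow_run`. The helper names (`failed_job_node_ids`, `resume_failed_nodes`) and the rule for deciding which nodes count as failed are my own assumptions, not something the linked page prescribes, so treat it as a starting point and check the response fields against your own workflow.

```python
# Sketch only: helper names and the "failed" heuristic are assumptions.
# `glue` is expected to be a boto3 Glue client, e.g. boto3.client("glue").

# Job run states treated as "needs a re-run" (an assumption; adjust as needed).
FAILED_STATES = {"FAILED", "TIMEOUT", "STOPPED", "ERROR"}


def failed_job_node_ids(graph):
    """Return the UniqueId of every JOB node that has a failed run and no successful one."""
    node_ids = []
    for node in graph.get("Nodes", []):
        if node.get("Type") != "JOB":
            continue  # skip trigger and crawler nodes
        runs = node.get("JobDetails", {}).get("JobRuns", [])
        states = {run.get("JobRunState") for run in runs}
        if states & FAILED_STATES and "SUCCEEDED" not in states:
            node_ids.append(node["UniqueId"])
    return node_ids


def resume_failed_nodes(glue, workflow_name, run_id):
    """Resume a finished workflow run from its failed job nodes.

    Returns the resume_workflow_run response, or None if nothing failed.
    """
    run = glue.get_workflow_run(
        Name=workflow_name, RunId=run_id, IncludeGraph=True
    )["Run"]
    node_ids = failed_job_node_ids(run.get("Graph", {}))
    if not node_ids:
        return None
    # Glue restarts the given nodes and whatever is downstream of them.
    return glue.resume_workflow_run(
        Name=workflow_name, RunId=run_id, NodeIds=node_ids
    )
```

You would call it as `resume_failed_nodes(boto3.client("glue"), workflow_name, run_id)` with your own workflow name and the ID of the failed run. As far as I understand it, the original run has to have finished before it can be resumed, and the resumed run is tracked under a new run ID, which is returned in the response.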
