Java Client for AWS EMR, listSteps doesn't show the latest step

I'm running a Java job that starts an AWS EMR cluster and runs steps on it. After I add a step to the cluster, I call the listSteps function to get the status of the steps and wait until they have all completed or failed.

I noticed that listSteps sometimes doesn't include the last step I added if I call it right after adding it. That makes me think all the steps are done when the latest step hasn't even started.

  1. Is this a known issue, or am I missing something here?
  2. Is there a "best practice" to avoid this other than "sleeping" before calling listSteps?

I'm using the "AmazonElasticMapReduceClient" class from the Amazon SDK.

I don't think there is a magic workaround for this kind of problem. Many AWS calls are asynchronous. For example, launching an EC2 instance returns right away, and you must then poll to see whether the instance is up yet. I think with a bit of design, it won't be much of an issue. I see two options:

When you create the cluster and add the job steps, you know how many steps and which steps you're adding, so you can start a new thread and monitor the cluster until all of them show up (in pseudocode):

function createCluster(steps, callback):
    aws.runJobFlow(...)
    on new thread:
        # wait until every step we added is visible in listSteps
        while not aws.listSteps(...).containsAll(steps):
            sleep()
        callback()

Then all your status check has to do (to see whether the job has finished) is call listSteps() and check the status of each step. That's probably the simplest solution to the problem.
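To make the idea concrete, here is a minimal Java sketch of that wait loop. It is written under some assumptions rather than as the definitive way to do it: the class `StepWaiter`, the method `awaitSteps`, and the `listStates` supplier are hypothetical names, and the supplier stands in for whatever code calls listSteps and turns the result into a map of step ID to state. The point it shows is that a step missing from the listing is treated as "not visible yet", never as "finished".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Sketch: wait for specific steps by ID instead of trusting one listSteps snapshot. */
final class StepWaiter {
    // Step states after which a step will not change any more (EMR step state names).
    static final Set<String> TERMINAL_STATES = new HashSet<>(
            Arrays.asList("COMPLETED", "CANCELLED", "FAILED", "INTERRUPTED"));

    /**
     * Polls until every expected step ID is visible and in a terminal state.
     *
     * @param expectedIds IDs of the steps that were added
     * @param listStates  hypothetical adapter around listSteps: returns stepId -> state
     * @param pollMillis  pause between polls
     * @param maxPolls    give up after this many polls
     */
    static Map<String, String> awaitSteps(Set<String> expectedIds,
                                          Supplier<Map<String, String>> listStates,
                                          long pollMillis, int maxPolls) {
        for (int attempt = 0; attempt < maxPolls; attempt++) {
            Map<String, String> states = listStates.get();
            if (allTerminal(expectedIds, states)) {
                return states;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for steps", e);
            }
        }
        throw new IllegalStateException("Steps not finished after " + maxPolls + " polls");
    }

    static boolean allTerminal(Set<String> expectedIds, Map<String, String> states) {
        for (String id : expectedIds) {
            String state = states.get(id);
            // A missing ID means the step is not listed yet, not that it is done.
            if (state == null || !TERMINAL_STATES.contains(state)) {
                return false;
            }
        }
        return true;
    }
}
```

With the SDK class from the question, the expected IDs would presumably come from the result of addJobFlowSteps (its getStepIds() list), and the adapter would map each StepSummary returned by listSteps to its getId() and getStatus().getState(). The poll limit keeps a step that never appears from blocking the caller forever.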

The other design option is to have a job step that notifies your software of progress or successful completion of the job. This option is asynchronous and doesn't require polling. For example, create a job step called notify. Then you run your steps like this:

  1. JobStep1
  2. Notify
  3. JobStep2
  4. Notify

Each notify step can call listSteps() on the job flow to see the result of the previous steps, then update a database, send a message to a service, or update a cache with the progress of the job.
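As a rough illustration of the ordering above, the step list could be built by interleaving a notify step after each real step. This is only a sketch: `NotifyPlan` and `withNotifySteps` are hypothetical names, and steps are represented by plain strings rather than the SDK's step configuration objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: build a step order in which every job step is followed by a notify step. */
final class NotifyPlan {
    static List<String> withNotifySteps(List<String> jobSteps, String notifyName) {
        List<String> plan = new ArrayList<>();
        for (String step : jobSteps) {
            plan.add(step);        // the real work
            plan.add(notifyName);  // reports on the step that just ran
        }
        return plan;
    }
}
```

For the two-step example, `NotifyPlan.withNotifySteps(Arrays.asList("JobStep1", "JobStep2"), "Notify")` yields the same four-entry order as the list above.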
