I am using the Java EMR API to run a Pig job on an EMR cluster. I am using the following code to add steps to the job flow:
String jobFlowId = "j-assdasd";
AmazonElasticMapReduceClient client = new AmazonElasticMapReduceClient(credentials);
StepFactory stepFactory = new StepFactory();
StepConfig executePig = new StepConfig()
        .withName("Execute Pig")
        .withActionOnFailure(ActionOnFailure.CANCEL_AND_WAIT)
        .withHadoopJarStep(
                stepFactory.newRunPigScriptStep("s3://bucket/script/load.pig"));
AddJobFlowStepsRequest pig = new AddJobFlowStepsRequest(jobFlowId)
        .withSteps(executePig);
AddJobFlowStepsResult result = client.addJobFlowSteps(pig);
How can I get the status of the "Execute Pig" step? I want the program to wait until the step finishes on EMR.
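I found a way to do it in Java:
// addJobFlowSteps returns the IDs of the submitted steps, in order
List<String> id = result.getStepIds();
// DescribeStep needs the cluster (job flow) ID as well as the step ID
DescribeStepResult res = client.describeStep(new DescribeStepRequest()
        .withClusterId(jobFlowId)
        .withStepId(id.get(0)));
StepStatus status = res.getStep().getStatus();
String stas = status.getState();
However, this returns the state only once, so we need to call it in a loop until the state is COMPLETED (or the step fails or is cancelled).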
As Ajay mentioned in his own answer, you need a loop that repeatedly checks the status of the cluster, bootstrap actions, and steps. This post shows how to create such a loop, keeping the program inside it until a certain status is reached.
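A minimal sketch of such a polling loop is below. It is not taken from the answers above: the class name StepWaiter, the method waitForTerminalState, and the parameters pollMillis and maxPolls are hypothetical names chosen for illustration. To keep the example runnable without the AWS SDK, the step state is read through a Supplier<String>, and the loop treats COMPLETED, CANCELLED, FAILED and INTERRUPTED as the states in which an EMR step has stopped.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper (not part of the AWS SDK): polls a step state until it is terminal.
final class StepWaiter {

    // EMR step states after which the step will not change any more.
    private static final Set<String> TERMINAL_STATES = new HashSet<>(
            Arrays.asList("COMPLETED", "CANCELLED", "FAILED", "INTERRUPTED"));

    private StepWaiter() {
    }

    /**
     * Calls stateSupplier up to maxPolls times, sleeping pollMillis between calls,
     * and returns the first terminal state it sees.
     */
    static String waitForTerminalState(Supplier<String> stateSupplier,
                                       long pollMillis,
                                       int maxPolls) {
        for (int attempt = 0; attempt < maxPolls; attempt++) {
            String state = stateSupplier.get();
            if (TERMINAL_STATES.contains(state)) {
                return state;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for step", e);
            }
        }
        throw new IllegalStateException(
                "Step did not reach a terminal state after " + maxPolls + " polls");
    }
}
```

With the client from the question, the supplier could be something like () -> client.describeStep(new DescribeStepRequest().withClusterId(jobFlowId).withStepId(id.get(0))).getStep().getStatus().getState(), called with a poll interval of several seconds so the API is not throttled. The caller should then check whether the returned state is COMPLETED before continuing, since the loop also ends when the step fails or is cancelled.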