Submit a job from Eclipse to a running cluster on Amazon EMR

I want to submit jobs from my Java code in Eclipse to an already-running EMR cluster, to avoid the startup time of a new cluster (launching EC2 instances, bootstrapping, and so on).

I know how to launch a new cluster from Java code, but it terminates once all of its jobs are done.

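// instances (a JobFlowInstancesConfig), the three StepConfig objects and the
// mapReduce client (AmazonElasticMapReduce) are defined elsewhere.
RunJobFlowRequest runFlowRequest = new RunJobFlowRequest()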
        .withName("Some name")
        .withInstances(instances)
        // .withBootstrapActions(bootstrapActions)
        .withJobFlowRole("EMR_EC2_DefaultRole")
        .withServiceRole("EMR_DefaultRole")
        .withSteps(firstJobStep, secondJobStep, thirdJobStep)
        .withLogUri("s3n://path/to/logs");

// Run the jobs
RunJobFlowResult runJobFlowResult = mapReduce
        .runJobFlow(runFlowRequest);
String jobFlowId = runJobFlowResult.getJobFlowId();

You have to set the KeepJobFlowAliveWhenNoSteps parameter to true; otherwise the cluster is terminated after it has executed all of its steps. With this property set, the cluster stays in the WAITING state after the steps finish.

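Add .withKeepJobFlowAliveWhenNoSteps(true) to the JobFlowInstancesConfig object (instances in the existing code) that is passed to withInstances(...). Once the cluster is waiting, further steps can be submitted to it with an AddJobFlowStepsRequest that carries the cluster's job flow ID.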

Refer to this doc for further details.
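To make the effect of the flag concrete, here is a small stand-alone Java model of the behaviour described above. It is only a sketch and does not use the AWS SDK: ToyCluster, its State enum, addSteps and runPendingSteps are hypothetical names invented for illustration, and the real service has more states than the three shown here.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Toy model of the cluster lifecycle; NOT the AWS SDK.
// Every name in this file is hypothetical and exists only for illustration.
class ToyCluster {
    enum State { RUNNING, WAITING, TERMINATED }

    // Plays the role of the KeepJobFlowAliveWhenNoSteps setting.
    private final boolean keepAliveWhenNoSteps;
    private final Queue<String> pendingSteps = new ArrayDeque<>();
    private State state = State.RUNNING;

    ToyCluster(boolean keepAliveWhenNoSteps, List<String> initialSteps) {
        this.keepAliveWhenNoSteps = keepAliveWhenNoSteps;
        this.pendingSteps.addAll(initialSteps);
    }

    // Steps can only be submitted while the cluster still exists.
    void addSteps(List<String> steps) {
        if (state == State.TERMINATED) {
            throw new IllegalStateException("cluster already terminated");
        }
        pendingSteps.addAll(steps);
        state = State.RUNNING;
    }

    // Runs every queued step, then either waits for more work or shuts down.
    // Returns the number of steps executed by this call.
    int runPendingSteps() {
        if (state == State.TERMINATED) {
            return 0;
        }
        int executed = 0;
        while (!pendingSteps.isEmpty()) {
            pendingSteps.remove();
            executed++;
        }
        state = keepAliveWhenNoSteps ? State.WAITING : State.TERMINATED;
        return executed;
    }

    State getState() {
        return state;
    }
}

class ToyClusterDemo {
    public static void main(String[] args) {
        // With the flag set, the cluster waits and accepts a later step.
        ToyCluster kept = new ToyCluster(true, Arrays.asList("first", "second"));
        kept.runPendingSteps();
        System.out.println(kept.getState()); // WAITING
        kept.addSteps(Arrays.asList("third"));
        kept.runPendingSteps();
        System.out.println(kept.getState()); // WAITING

        // Without the flag, the cluster is gone once its steps are done.
        ToyCluster oneShot = new ToyCluster(false, Arrays.asList("first"));
        oneShot.runPendingSteps();
        System.out.println(oneShot.getState()); // TERMINATED
    }
}
```

The second cluster in the demo mirrors the situation in the question: without the flag there is nothing left to submit to once the initial steps finish, so addSteps on it would throw.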

