Submit a job from Eclipse to a running cluster on Amazon EMR

I want to add jobs from my Java code in Eclipse to an already running EMR cluster, to save the startup time (creating EC2 instances, bootstrapping...).

I know how to run a new cluster from Java code, but it terminates after all jobs are done.

RunJobFlowRequest runFlowRequest = new RunJobFlowRequest()
        .withName("Some name")
        .withInstances(instances)
        // .withBootstrapActions(bootstrapActions)
        .withJobFlowRole("EMR_EC2_DefaultRole")
        .withServiceRole("EMR_DefaultRole")
        .withSteps(firstJobStep, secondJobStep, thirdJobStep)
        .withLogUri("s3n://path/to/logs");

// Run the jobs
RunJobFlowResult runJobFlowResult = mapReduce
        .runJobFlow(runFlowRequest);
String jobFlowId = runJobFlowResult.getJobFlowId();

You have to set the KeepJobFlowAliveWhenNoSteps parameter to true; otherwise the cluster is terminated after executing all the steps. If this property is set, the cluster stays in the WAITING state after executing all the steps.

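Add .withKeepJobFlowAliveWhenNoSteps(true) to the existing code. In the AWS SDK for Java this setter belongs to JobFlowInstancesConfig, so it goes on the instances object that is passed to withInstances(...), not on the RunJobFlowRequest itself.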
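To make the lifecycle rule concrete, here is a small toy model in plain Java. It is not the AWS SDK and does not talk to EMR; the class ToyCluster, its methods, and the reduced set of states are all made up for illustration. It only mimics the behaviour described above: without the keep-alive flag the cluster ends up terminated once its steps are done, and with the flag it waits and can accept more steps.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: NOT the AWS SDK. All names here are hypothetical.
class ToyCluster {
    // A reduced set of cluster states, enough to show the rule.
    enum State { STARTING, RUNNING, WAITING, TERMINATED }

    // Plays the role of KeepJobFlowAliveWhenNoSteps.
    private final boolean keepAliveWhenNoSteps;
    private final Deque<String> pendingSteps = new ArrayDeque<>();
    private State state = State.STARTING;

    ToyCluster(boolean keepAliveWhenNoSteps) {
        this.keepAliveWhenNoSteps = keepAliveWhenNoSteps;
    }

    // Queue a step; a terminated cluster cannot take new work.
    void addStep(String stepName) {
        if (state == State.TERMINATED) {
            throw new IllegalStateException("cluster is terminated, start a new one");
        }
        pendingSteps.add(stepName);
        state = State.RUNNING;
    }

    // "Execute" every queued step, then apply the keep-alive rule.
    State runAllSteps() {
        if (state == State.TERMINATED) {
            return state;
        }
        while (!pendingSteps.isEmpty()) {
            pendingSteps.poll(); // pretend the step ran to completion
        }
        state = keepAliveWhenNoSteps ? State.WAITING : State.TERMINATED;
        return state;
    }

    State getState() {
        return state;
    }
}

class ToyClusterDemo {
    public static void main(String[] args) {
        ToyCluster transientCluster = new ToyCluster(false);
        transientCluster.addStep("firstJobStep");
        System.out.println(transientCluster.runAllSteps()); // prints TERMINATED

        ToyCluster longRunning = new ToyCluster(true);
        longRunning.addStep("firstJobStep");
        System.out.println(longRunning.runAllSteps());      // prints WAITING
        longRunning.addStep("laterJobStep");                // accepted while waiting
        System.out.println(longRunning.getState());         // prints RUNNING
    }
}
```

With a real cluster that is kept alive this way, later steps would be submitted against the jobFlowId of the running cluster (the SDK offers AddJobFlowStepsRequest for this) instead of calling runJobFlow again.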

Refer to this doc for further details.
