
Deploying application with spark-submit: Application is added to the scheduler and is not yet activated

I have a VirtualBox VM running Linux CentOS with 12 GB of memory. I need to deploy two applications to Hadoop running in a non-distributed (single-node) configuration. This is my YARN config:

<configuration>
<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
</property>
<property>
   <name>yarn.resourcemanager.address</name>
   <value>0.0.0.0:8032</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>130</value>
</property>
<property>
   <name>yarn.nodemanager.vmem-check-enabled</name>
   <value>false</value>
   <description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
   <name>yarn.scheduler.maximum-allocation-mb</name>
   <value>4048</value>
</property>
<property>
   <name>yarn.nodemanager.vmem-pmem-ratio</name>
   <value>1</value>
   <description>Ratio between virtual memory to physical memory when
setting memory limits for containers</description>
</property>
<property>
   <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
   <value>1</value>
</property>
</configuration>
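As a side note (not part of the original question), the scheduler settings that YARN actually applies, including each queue's AM resource limit, can be inspected at runtime through the ResourceManager REST API; master:8088 is the ResourceManager address used in this setup:

# Dump the effective scheduler configuration as JSON and pretty-print it
curl -s http://master:8088/ws/v1/cluster/scheduler | python -m json.tool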

I deploy the first application and it runs correctly:

spark-submit --master yarn --deploy-mode client \
  --name OryxBatchLayer-ALSExample --class com.cloudera.oryx.batch.Main \
  --files oryx.conf \
  --driver-memory 500m --driver-java-options "-Dconfig.file=oryx.conf" \
  --executor-memory 500m --executor-cores 1 \
  --conf spark.executor.extraJavaOptions="-Dconfig.file=oryx.conf" \
  --conf spark.ui.port=4040 \
  --conf spark.io.compression.codec=lzf \
  --conf spark.logConf=true \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.speculation=true \
  --conf spark.ui.showConsoleProgress=false \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors=1 \
  oryx-batch-2.8.0-SNAPSHOT.jar

The YARN manager at port 8088 indicates that I'm using 2 of 8 vcores and 2 GB of 8 GB memory:

[screenshot: YARN ResourceManager web UI showing cluster resource usage]
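(The running applications can also be cross-checked from the command line; an optional suggestion, not part of the original post:)

# List YARN applications currently in the RUNNING state
yarn application -list -appStates RUNNING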

Now I deploy my second application:

spark-submit --master yarn --deploy-mode client \
  --name OryxSpeedLayer-ALSExample --class com.cloudera.oryx.speed.Main \
  --files oryx.conf \
  --driver-memory 500m --driver-java-options "-Dconfig.file=oryx.conf" \
  --executor-memory 500m --executor-cores 1 \
  --conf spark.executor.extraJavaOptions="-Dconfig.file=oryx.conf" \
  --conf spark.ui.port=4041 \
  --conf spark.io.compression.codec=lzf \
  --conf spark.logConf=true \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.speculation=true \
  --conf spark.ui.showConsoleProgress=false \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors=1 \
  oryx-speed-2.8.0-SNAPSHOT.jar

but this time I get a warning, and the second application seems to be frozen; at least it doesn't allocate any memory:

2018-08-06 04:49:10 INFO Client:54 -
	 client token: N/A
	 diagnostics: [Mon Aug 06 04:49:09 -0400 2018] Application is added to the scheduler and is not yet activated. Queue's AM resource limit exceeded. Details : AM Partition = ; AM Resource Request = ; Queue Resource Limit for AM = ; User AM Resource Limit of the queue = ; Queue AM Resource Usage = ;
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1533545349902
	 final status: UNDEFINED
	 tracking URL: http://master:8088/proxy/application_1533542648791_0002/
	 user: osboxes

[screenshot: YARN ResourceManager web UI showing the second application pending without allocated resources]

What is the root cause of the problem? How can I increase the Queue Resource Limit for AM and the User AM Resource Limit of the queue?
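As to the root cause, the numbers line up with the CapacityScheduler's AM resource limit. A rough sketch of the arithmetic, assuming the default yarn.scheduler.capacity.maximum-am-resource-percent of 0.1 and the default minimum container allocation of 1024 MB (neither value is shown in the post, so treat this as an educated guess):

CLUSTER_MB=8192                            # total scheduler memory shown in the YARN UI
echo "AM limit: $((CLUSTER_MB / 10)) MB"   # 0.1 * 8192 = ~819 MB allowed for ALL AMs
# Each AM requests 500m, rounded up to the 1024 MB minimum allocation.
# The first AM is activated regardless (the scheduler always lets at least
# one application per queue run), but a second AM would bring AM usage to
# 2048 MB > 819 MB, so the second application stays "not yet activated".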

The fix was to edit

~/hadoop-3.1.0/etc/hadoop/capacity-scheduler.xml

and update the default value of .1 to 1. The CapacityScheduler reads this property from capacity-scheduler.xml, which is presumably why setting it in yarn-site.xml (as above) had no effect:

  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>1</value>
    <description>
      Maximum percent of resources in the cluster which can be used to run
      application masters i.e. controls number of concurrent running
      applications.
    </description>
  </property>
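After editing the file, the ResourceManager has to pick up the change. Restarting it works; alternatively (a hedged suggestion, not from the original answer), the queue configuration can usually be reloaded in place:

# Reload capacity-scheduler.xml into the running ResourceManager
yarn rmadmin -refreshQueues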
