
Spark job leverage all nodes

So my setup on AWS is 1 master node and 2 executor nodes. I'd expect both executor nodes to work on my task, but I can see only one gets registered normally; the other one registers as ApplicationMaster. I can also see that 16 partitions at a time are processed.

I use spark-shell for now, with all the default settings on EMR 4.3. Command to start the shell:

# memory for each executor and for the driver
export SPARK_EXECUTOR_MEMORY=20g
export SPARK_DRIVER_MEMORY=4g
# request 2 executors with 16 cores each
spark-shell --num-executors 2 --executor-cores 16 --packages com.databricks:spark-redshift_2.10:0.6.0 --driver-java-options "-Xss100M" --conf spark.driver.maxResultSize=0

Any ideas where to start debugging this? Or is it correct behaviour?

I think the issue is that you are running in 'cluster' mode, so the Spark driver is running inside an ApplicationMaster on one of the executor nodes and using 1 core. Because your executors each require 16 cores, that node only has 15 cores available and does not have the resources to launch a second executor. You can verify this by looking at "Nodes" in the YARN UI. The solution may be to launch the spark shell in client mode (--deploy-mode client) or to reduce the number of executor cores.
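For example, here is a minimal sketch of the two adjustments, reusing the package and flags from the question; leaving exactly one core free per executor is an assumption and depends on how many cores YARN reserves for the ApplicationMaster on your cluster:

# Option 1: run the shell's driver in client deploy mode so it does not
# compete with the executors for a container core
spark-shell --deploy-mode client --num-executors 2 --executor-cores 16 --packages com.databricks:spark-redshift_2.10:0.6.0 --driver-java-options "-Xss100M" --conf spark.driver.maxResultSize=0

# Option 2: ask for one fewer core per executor (assumed value), so the node
# hosting the ApplicationMaster can still fit a full executor
spark-shell --num-executors 2 --executor-cores 15 --packages com.databricks:spark-redshift_2.10:0.6.0 --driver-java-options "-Xss100M" --conf spark.driver.maxResultSize=0

Either way, check the "Nodes" page in the YARN UI afterwards to confirm that both executor nodes now report running containers.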
