
Spark job doesn't leverage all nodes

So my setup on AWS is 1 master node and 2 executor nodes. I'd expect both executor nodes to work on my task, but I can see that only one gets registered normally; the other one shows up as the ApplicationMaster. I can also see that 16 partitions are processed at a time.

I'm using spark-shell for now, with all the default settings on EMR 4.3. Command to start the shell:

export SPARK_EXECUTOR_MEMORY=20g
export SPARK_DRIVER_MEMORY=4g
spark-shell --num-executors 2 --executor-cores 16 --packages com.databricks:spark-redshift_2.10:0.6.0 --driver-java-options "-Xss100M" --conf spark.driver.maxResultSize=0

Any ideas on where to start debugging this? Or is this correct behaviour?
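
One place to start, before changing any configuration, is to check how YARN has actually placed containers on the two nodes. A minimal sketch, assuming you can run the YARN CLI on the master node (the <node-id> placeholder comes from the first command's output):

# List NodeManagers and the number of containers each is running;
# with two healthy executors you'd expect containers on both nodes.
yarn node -list

# Show one node's resource usage (memory, vcores) in detail,
# e.g. to see whether an ApplicationMaster container is holding a core there.
yarn node -status <node-id>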

I think the issue is that you are running in 'cluster' mode, so the Spark driver is running inside an ApplicationMaster container on one of the executor nodes and is using 1 core. Because your executors each require 16 cores, that node only has 15 cores left and does not have the resources to launch a second executor. You can verify this by looking at "Nodes" in the YARN UI. The solution may be to launch the spark shell in client mode (--deploy-mode client) or to reduce the number of executor cores.
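
For example, a rough sketch of both suggestions, keeping the rest of your original options unchanged (treat these as starting points; the exact core count that fits depends on how many vcores YARN advertises per node):

# Option 1: run the driver on the master node instead of inside a YARN container.
spark-shell --deploy-mode client --num-executors 2 --executor-cores 16 --packages com.databricks:spark-redshift_2.10:0.6.0 --driver-java-options "-Xss100M" --conf spark.driver.maxResultSize=0

# Option 2: leave a core free on each node for the ApplicationMaster / driver.
spark-shell --num-executors 2 --executor-cores 15 --packages com.databricks:spark-redshift_2.10:0.6.0 --driver-java-options "-Xss100M" --conf spark.driver.maxResultSize=0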
