spark-submit: Difference between “--master local[n]” and “--master local --executor-cores m”

Question

I have a dual-core machine (with 2 threads on each core). I run a Spark job with 2 different spark-submit parameters.

spark-submit --master local[4]

spark-submit --master local --executor-cores 2

Is there really any difference between the two examples above? I am trying to get Spark to use 4 total threads for Spark "tasks", 2 threads on each physical core.

Answer 1

First of all, neither the --executor-cores argument nor the spark.executor.cores configuration option applies in local mode. As a result:

--master local[4] starts Spark in local mode using four worker threads.
--master local starts Spark in local mode using one worker thread. --executor-cores has no effect.
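
The mapping from a local master string to a worker-thread count can be sketched in plain Python. This is only an illustration, not Spark's own code: local_worker_threads is a hypothetical helper, and it assumes the master is either local, local[N], or local[*] (where * means one thread per logical CPU). Other master forms are not handled here.

```python
import os
import re

# Hypothetical helper, not part of Spark: it only illustrates how a
# local master string translates into a number of worker threads.
_LOCAL_N = re.compile(r"^local\[(\d+|\*)\]$")


def local_worker_threads(master: str) -> int:
    """Return the worker-thread count implied by a local master string."""
    if master == "local":
        # Plain "local" means a single worker thread.
        return 1
    match = _LOCAL_N.match(master)
    if match is None:
        raise ValueError(f"not a supported local master string: {master!r}")
    count = match.group(1)
    if count == "*":
        # "*" stands for the number of logical CPUs on the machine.
        return os.cpu_count() or 1
    return int(count)


if __name__ == "__main__":
    for master in ("local", "local[4]", "local[*]"):
        print(master, "->", local_worker_threads(master))
```

Note that the helper takes no executor-cores parameter at all, which reflects the point above: in local mode only the master string decides the thread count.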

These counts cover only the "data processing" threads, that is, the threads that run tasks. The overall number of threads used by Spark can be significantly larger.

Without going into OS and scheduling details, the first option (--master local[4]) is the one you're looking for if you want to utilize four threads.

1 answer

solution1
2 ACCEPTED 2016-10-09 02:28:40
