简体   繁体   中英

How SPARK_WORKER_CORES setting impacts concurrency in Spark Standalone

I am using a Spark 2.2.0 cluster configured in Standalone mode. Cluster has 2 octa core machines. This cluster is exclusively for Spark jobs and no other process uses them. I have around 8 Spark Streaming apps which run on this cluster.
I explicitly set SPARK_WORKER_CORES (in spark-env.sh) to 8 and allocate one core to each app using total-executor-cores setting. This config reduces the capability to work in parallel on multiple tasks. If a stage works on a partitioned RDD with 200 partitions, only one task executes at a time. What I wanted Spark to do was to start separate thread for each job and process in parallel. But I couldn't find a separate Spark setting to control the number of threads.
So, I decided to play around and bloated the number of cores (ie SPARK_WORKER_CORES in spark-env.sh) to 1000 on each machine. Then I gave 100 cores to each Spark application. I found that spark started processing 100 partitons in parallel this time indicating that 100 threads were being used.
I am not sure if this is the correct method of impacting the number of threads used by a Spark job.

You mixed up two things:

  • Cluster manger properties - SPARK_WORKER_CORES - total number of cores that worker can offer. Use it to control a fraction of resources that should be used by Spark in total
  • Application properties --total-executor-cores / spark.cores.max - number of cores that application requests from the cluster manager. Use it control in-app parallelism.

Only the second on is directly responsible for app parallelism as long as, the first one is not limiting.

Also CORE in Spark is a synonym of thread. If you:

allocate one core to each app using total-executor-cores setting.

then you specifically assign a single data processing thread.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM