
spark.executor.instances over spark.dynamicAllocation.enabled = True

I'm working on a Spark project using the MapR distribution, where dynamic allocation is enabled. Please refer to the parameters below:

spark.dynamicAllocation.enabled         true
spark.shuffle.service.enabled           true
spark.dynamicAllocation.minExecutors    0
spark.dynamicAllocation.maxExecutors    20
spark.executor.instances                2

As per my understanding, spark.executor.instances is what we set with --num-executors when submitting our PySpark job.
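For context, here is roughly how such a job is submitted; the script name and resource values are illustrative:

    spark-submit \
      --master yarn \
      --num-executors 5 \
      --executor-memory 4g \
      my_job.py

Passing --num-executors 5 here amounts to passing --conf spark.executor.instances=5.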

I have the following two questions:

  1. If I use --num-executors 5 during job submission, will it override the spark.executor.instances 2 config setting?

  2. What is the purpose of defining spark.executor.instances when the dynamic allocation min and max executors are already defined?

There is one more relevant parameter:

spark.dynamicAllocation.initialExecutors

By default it takes the value of spark.dynamicAllocation.minExecutors. If spark.executor.instances is defined and is larger than minExecutors, then spark.executor.instances becomes the initial number of executors.
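A minimal sketch of that resolution logic in Python, assuming the max-of-the-three behavior described above (this mirrors the documented behavior, not the actual Spark source):

    # Hypothetical helper mirroring how Spark resolves the initial
    # executor count under dynamic allocation: the maximum of
    # minExecutors, initialExecutors, and executor.instances.
    def initial_executors(conf):
        min_execs = int(conf.get("spark.dynamicAllocation.minExecutors", "0"))
        init_execs = int(conf.get("spark.dynamicAllocation.initialExecutors",
                                  str(min_execs)))
        instances = int(conf.get("spark.executor.instances", "0"))
        return max(min_execs, init_execs, instances)

    # With the configuration from the question:
    conf = {
        "spark.dynamicAllocation.minExecutors": "0",
        "spark.dynamicAllocation.maxExecutors": "20",
        "spark.executor.instances": "2",
    }
    print(initial_executors(conf))  # 2 -> the job starts with 2 executors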

spark.executor.instances is basically the property for static allocation. However, if dynamic allocation is enabled, the initial number of executors will be at least spark.executor.instances.

The value in the config file won't get overwritten when you set --num-executors; the flag sets spark.executor.instances for that particular submission, and submission-time settings take precedence over spark-defaults.conf.
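To check what actually took effect, you can read the property back from inside the running job; a quick sketch:

    # Read back the effective value inside the running PySpark job;
    # after submitting with --num-executors 5, this prints 5.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    print(spark.sparkContext.getConf().get("spark.executor.instances", "not set"))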

Extra reading: the official documentation.
