
Why do we need more executors than the number of machines in Spark?

What's the logic behind requesting more executors than machines available in your cluster?

In the ideal situation, we would like to have 1 executor (= 1 JVM) on each of our machines, rather than several on each machine.
If not, then why?

Thanks in advance

In the ideal situation, we would like to have 1 executor (= 1 JVM) on each of our machines, rather than several on each machine.

Not necessarily. Depending on the amount of available memory and the JVM implementation, running several separate JVMs per machine can be a much better option, in particular to:

  • Improve memory management on large machines - see for example "Why 35GB Heap is Less Than 32GB – Java JVM Memory Oddities".
  • Improve fault tolerance with unstable workloads - if one JVM fails, you lose the work of all threads running in it, so smaller JVMs limit how much is lost in a single failure.
  • Minimize the effort required for GC tuning - very large heaps can be extremely painful to tune.
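
To make the first point concrete, here is a minimal sketch of how one might split a machine into several executors so that each heap stays below the roughly 32 GB range discussed in the linked article. The helper `plan_executors`, the 30 GB cap, and the reserved memory and core defaults are assumptions chosen for illustration, not Spark recommendations; the sketch also ignores per-executor memory overhead. Only the three configuration keys are real Spark settings.

```python
import math


def plan_executors(node_memory_gb, node_cores, num_nodes=1,
                   max_heap_gb=30, reserved_memory_gb=16, reserved_cores=1):
    """Hypothetical helper: split each node into executors with bounded heaps.

    max_heap_gb, reserved_memory_gb and reserved_cores are illustrative
    assumptions (room left for the OS and other daemons), not Spark defaults.
    """
    usable_memory = node_memory_gb - reserved_memory_gb
    usable_cores = node_cores - reserved_cores
    if usable_memory <= 0 or usable_cores <= 0:
        raise ValueError("node is too small after reserving resources")

    # Smallest number of executors per node that keeps each heap <= max_heap_gb.
    per_node = math.ceil(usable_memory / max_heap_gb)
    heap_gb = usable_memory // per_node
    cores = max(1, usable_cores // per_node)

    # Real Spark configuration keys; the values are derived from the assumptions above.
    return {
        "spark.executor.instances": str(per_node * num_nodes),
        "spark.executor.memory": f"{heap_gb}g",
        "spark.executor.cores": str(cores),
    }


if __name__ == "__main__":
    # Example: 4 machines, each with 256 GB of RAM and 32 cores.
    for key, value in plan_executors(256, 32, num_nodes=4).items():
        print(f"--conf {key}={value}")
```

With these assumptions, a large machine ends up hosting several moderately sized executors instead of one very large JVM, while a small machine still gets a single executor.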

