简体   繁体   中英

How spark manages IO perfomnce if we reduce the number of cores per executor and incease number of executors

As per my research whenever we run the spark job we should not run the executors with more than 5 cores, if we increase the cores beyond the limit job will suffer due to bad I/O throughput.

my doubt is if we increase the number of executors and reduce the cores, even then these executors will be ending up in the same physical machine and those executors will be reading from the same disk and writing to the same disk, why will this not cause I/O throughput issue.

can consider Apache Spark: The number of cores vs. the number of executors

use case for reference.

The core within the executor are like threads. So just like how more work is done if we increase parallelism, we should always keep in mind that there is a limit to it. Because we have to gather the results from those parallel tasks.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM