
Does reducing the number of executor-cores consume less executor-memory?

My Spark job failed with the YARN error "Container killed by YARN for exceeding memory limits. 10.0 GB of 10 GB physical memory used".

Acting on intuition, I decreased the number of cores from 5 to 1, and the job ran successfully.

I did not increase the executor-memory because 10g was the max for my YARN cluster.

I just want to confirm my intuition: does reducing the number of executor-cores reduce executor-memory consumption? If so, why?

spark.executor.cores = 5, spark.executor.memory=10G

This means an executor can run 5 tasks in parallel, so the 10 GB has to be shared by 5 tasks. Effectively, each task has about 2 GB available on average. If all the tasks consume more than 2 GB at the same time, the JVM as a whole will end up using more than 10 GB, and YARN will kill the container.
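For illustration, here is a minimal Scala sketch of this (failing) configuration, assuming the settings are passed when the SparkSession is built; the app name is made up:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("five-cores-sketch")
  .config("spark.executor.cores", "5")    // 5 tasks run concurrently in each executor
  .config("spark.executor.memory", "10g") // one 10 GB heap shared by those 5 tasks
  .getOrCreate()

// Rough per-task share of the heap: 10 GB / 5 = 2 GB. If the concurrent
// tasks together need more than the container limit, YARN kills the executor.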

spark.executor.cores = 1, spark.executor.memory=10G

This means an executor can run only 1 task, so the full 10 GB is available to that single task. If the task uses more than 2 GB but less than 10 GB, it will work fine. That was the case in your job, which is why it succeeded.
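To make the comparison concrete, here is a small hypothetical helper in Scala that just does the division from the two cases above:

// Hypothetical helper, ignoring memoryOverhead and Spark's internal
// memory regions: the rough share of the executor heap each task gets.
def perTaskShareGb(executorMemoryGb: Double, executorCores: Int): Double =
  executorMemoryGb / executorCores

perTaskShareGb(10.0, 5) // ≈ 2.0 GB per task -> exceeded by your tasks, container killed
perTaskShareGb(10.0, 1) // 10.0 GB for the single task -> enough, so the job succeeded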

Yes. In addition, each executor uses an extra ~7% of its memory as memoryOverhead.

The calculation below assumes you have two nodes, with three executors on one node and two executors on the other.

Memory per executor on the first node = 10 GB / 3 ≈ 3.33 GB
Off-heap overhead = 7% of 3.33 GB ≈ 0.23 GB

So your executor-memory should be 3.33 GB - 0.23 GB ≈ 3.1 GB per executor.
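The same arithmetic in Scala, keeping this answer's assumptions (a 7% overhead factor and a 10 GB node running 3 executors):

// Sketch of the arithmetic above; the 7% overhead factor and the
// 3-executors-per-10-GB-node split are this answer's assumptions.
val nodeMemoryGb     = 10.0
val executorsPerNode = 3
val overheadFraction = 0.07

val memoryPerExecutor = nodeMemoryGb / executorsPerNode       // ≈ 3.33 GB
val overhead          = memoryPerExecutor * overheadFraction  // ≈ 0.23 GB
val executorMemoryGb  = memoryPerExecutor - overhead          // ≈ 3.10 GB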

You can read another explanation here: https://spoddutur.github.io/spark-notes/distribution_of_executors_cores_and_memory_for_spark_application.html
