
Understanding spark.yarn.executor.memoryOverhead

I am running a Spark application on YARN, with the driver and executor memory set as --driver-memory 4G --executor-memory 2G.

When I run the application, an exception is thrown complaining: Container killed by YARN for exceeding memory limits. 2.5 GB of 2.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

What does this 2.5 GB mean here (overhead memory, executor memory, or overhead + executor memory)? I ask because when I change the memory settings to:

--driver-memory 4G --executor-memory 4G --conf spark.yarn.executor.memoryOverhead=2048, then the exception disappears.

Although I have only boosted the overhead memory to 2 GB, which is still under 2.5 GB, why does it work now?

Let us understand how memory is divided among the various regions in Spark.

  1. Executor MemoryOverhead:

spark.yarn.executor.memoryOverhead = max(384 MB, 0.07 * spark.executor.memory). In your first case, memoryOverhead = max(384 MB, 0.07 * 2 GB) = max(384 MB, 143.36 MB) = 384 MB. Hence, 384 MB is reserved in each executor, assuming you have assigned a single core per executor.
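As a quick sanity check, here is that formula applied to both of your settings, as a minimal Scala sketch (the constants mirror the defaults quoted above; everything else is illustrative):

```scala
// Defaults quoted above: a 384 MB floor and a 0.07 fraction of executor memory.
val minOverheadMb    = 384
val overheadFraction = 0.07

def overheadMb(executorMemoryMb: Int): Long =
  math.max(minOverheadMb, (overheadFraction * executorMemoryMb).toLong)

println(overheadMb(2048)) // first case  (--executor-memory 2G): max(384, 143) = 384
println(overheadMb(4096)) // second case (--executor-memory 4G): max(384, 286) = 384
```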

  2. Execution and Storage Memory:

By default spark.memory.fraction = 0.6, which implies that execution and storage, as a unified region, occupy 60% of the memory remaining after memoryOverhead (2048 MB - 384 MB = 1664 MB), i.e. roughly 998 MB. There is no strict boundary allocated to each sub-region unless you enable spark.memory.useLegacyMode; otherwise they share a moving boundary.
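The 998 MB figure comes from the arithmetic below, which follows this answer's simplified model (real Spark also keeps a ~300 MB reserve inside the heap, which this model ignores):

```scala
// Simplified model (values in MB): unified and user memory are carved out of
// what remains after memoryOverhead is set aside.
val executorMemoryMb = 2048   // --executor-memory 2G
val overheadMb       = 384.0  // from step 1
val memoryFraction   = 0.6    // default spark.memory.fraction

val remainingMb = executorMemoryMb - overheadMb        // 1664 MB
val unifiedMb   = memoryFraction * remainingMb         // ~998 MB for execution + storage
val userMb      = (1 - memoryFraction) * remainingMb   // ~666 MB user memory (see step 3)
println(f"unified: $unifiedMb%.0f MB, user: $userMb%.0f MB")
```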

  3. User Memory:

This is the memory pool that remains after the allocation of Execution and Storage Memory, and it is completely up to you to use it in any way you like. You can store your own data structures there that are used in RDD transformations. For example, you can rewrite a Spark aggregation by using a mapPartitions transformation that maintains a hash table, as sketched below. This comprises the remaining 40% of memory left after memoryOverhead; in your case that is ~660 MB.
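For illustration, a minimal sketch of such a mapPartitions-based aggregation (the dataset and names are made up; the point is that the per-partition hash table is allocated in user memory):

```scala
import org.apache.spark.sql.SparkSession
import scala.collection.mutable

val spark = SparkSession.builder().appName("user-memory-demo").getOrCreate()
val sc = spark.sparkContext

val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3), ("b", 4)))

// Per-partition pre-aggregation with a plain HashMap (lives in user memory),
// followed by a final merge across partitions.
val partial = pairs.mapPartitions { iter =>
  val acc = mutable.HashMap.empty[String, Int]
  iter.foreach { case (k, v) => acc.update(k, acc.getOrElse(k, 0) + v) }
  acc.iterator
}
partial.reduceByKey(_ + _).collect().foreach(println)
```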

If your job exceeds any of the above allocations, it is highly likely to end up with OOM problems.
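Putting the pieces together also explains the 2.5 GB in your error: YARN enforces the sum of executor memory and memoryOverhead, rounded up to a multiple of yarn.scheduler.minimum-allocation-mb. A sketch of that arithmetic, assuming that setting is 512 MB on your cluster (an assumption; the question does not state it):

```scala
// Assumed cluster setting (not given in the question):
// yarn.scheduler.minimum-allocation-mb = 512. YARN rounds each container
// request up to a multiple of this value.
val minAllocationMb = 512

def containerMb(executorMemoryMb: Int, overheadMb: Int): Int = {
  val requested = executorMemoryMb + overheadMb
  ((requested + minAllocationMb - 1) / minAllocationMb) * minAllocationMb
}

println(containerMb(2048, 384))  // first case:  2432 -> 2560 MB, i.e. the 2.5 GB limit
println(containerMb(4096, 2048)) // second case: 6144 -> 6144 MB, a much roomier container
```

Under that assumption, your second run raised the container limit from 2.5 GB to 6 GB, which is why the kill no longer triggers.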
