
Understanding spark.yarn.executor.memoryOverhead

When I am running a Spark application on YARN, with driver and executor memory set as --driver-memory 4G --executor-memory 2G

Then when I run the application, an exception is thrown complaining: Container killed by YARN for exceeding memory limits. 2.5 GB of 2.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

What does this 2.5 GB mean here? (Overhead memory, executor memory, or overhead + executor memory?) I ask because when I change the memory settings to:

--driver-memory 4G --executor-memory 4G --conf spark.yarn.executor.memoryOverhead=2048, then the exception disappears.

I would ask: although I have boosted the overhead memory to 2G, it is still under 2.5G, so why does it work now?

Let us understand how memory is divided among various regions in Spark.

  1. Executor Memory Overhead:

spark.yarn.executor.memoryOverhead = max(384 MB, .07 * spark.executor.memory). In your first case, memoryOverhead = max(384 MB, 0.07 * 2 GB) = max(384 MB, 143.36 MB). Hence, memoryOverhead = 384 MB is reserved in each executor, assuming you have assigned a single core per executor.
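The default-overhead formula above can be sketched in a few lines of Python (the helper name is hypothetical; the 0.07 factor and 384 MB floor are the defaults quoted in this answer):

```python
# Sketch of the default overhead calculation for Spark-on-YARN executors.
def default_memory_overhead_mb(executor_memory_mb: int) -> int:
    """spark.yarn.executor.memoryOverhead = max(384 MB, 0.07 * executor memory)."""
    return max(384, int(0.07 * executor_memory_mb))

# First case: --executor-memory 2G -> the 143.36 MB product is below the floor.
print(default_memory_overhead_mb(2048))   # 384
# Second case: --executor-memory 4G -> still the floor unless overridden.
print(default_memory_overhead_mb(4096))   # 384
```

This is why passing spark.yarn.executor.memoryOverhead=2048 explicitly matters: the default would stay at 384 MB even for a 4 GB executor.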

  2. Execution and Storage Memory:

By default spark.memory.fraction = 0.6, which implies that execution and storage, as a unified region, occupy 60% of the remaining memory, i.e. 998 MB. There is no strict boundary allocated to each region unless you enable spark.memory.useLegacyMode; instead, they share a moving boundary.
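The 998 MB figure can be reproduced with the arithmetic this answer uses, namely taking 60% of the executor memory left after subtracting the 384 MB overhead (a simplification: Spark's actual UnifiedMemoryManager applies spark.memory.fraction to the JVM heap minus a 300 MB reserve, so real numbers differ slightly):

```python
# The answer's arithmetic for the 2 GB executor case, in MB.
executor_memory_mb = 2048
overhead_mb = 384              # default memoryOverhead, see above
memory_fraction = 0.6          # default spark.memory.fraction

remaining_mb = executor_memory_mb - overhead_mb     # 1664 MB left after overhead
unified_mb = remaining_mb * memory_fraction         # execution + storage region
user_mb = remaining_mb * (1 - memory_fraction)      # the remaining 40%

print(round(unified_mb, 1))   # 998.4 -> the ~998 MB quoted above
print(round(user_mb, 1))      # 665.6
```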

  3. User Memory:

This is the memory pool that remains after the allocation of Execution and Storage Memory, and it is completely up to you to use it in a way you like. You can store your own data structures there that would be used in RDD transformations. For example, you can rewrite Spark aggregation by using a mapPartitions transformation that maintains a hash table for the aggregation. This comprises the 40% of memory left after Memory Overhead. In your case it is ~660 MB.
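The mapPartitions pattern mentioned above can be sketched in plain Python, with no Spark dependency (partitions are modelled as lists; the per-partition hash table is the user-memory data structure the answer is describing):

```python
from collections import defaultdict
from typing import Iterable, Iterator, List, Tuple

def aggregate_partition(rows: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """The kind of function you would pass to rdd.mapPartitions: build a
    hash table per partition instead of emitting every record individually."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return iter(totals.items())

# Two "partitions" of (key, value) records.
partitions: List[List[Tuple[str, int]]] = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("c", 5)],
]

# Partial aggregates per partition, then a final merge of the partials.
merged = defaultdict(int)
for part in partitions:
    for key, subtotal in aggregate_partition(part):
        merged[key] += subtotal

print(dict(merged))   # {'a': 4, 'b': 6, 'c': 5}
```

Note that the hash tables live on the executor heap in user memory, so very skewed keys can still blow past the ~660 MB budget discussed above.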

If your job does not fit within any of the above allocations, it is highly likely to end up with OOM problems.
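As for the 2.5 GB in the error message: YARN enforces the limit on executor memory plus overhead, rounded up to the scheduler's allocation increment. Assuming the common 512 MB increment (yarn.scheduler.minimum-allocation-mb; your cluster's value may differ), 2048 MB + 384 MB rounds up to exactly 2.5 GB, which matches the error, and the fixed settings request a far larger container:

```python
# Rough model of the memory YARN reserves for one executor container:
# executor memory + overhead, rounded up to the scheduler increment.
# The 512 MB increment is an assumption about the cluster's YARN config.
def container_request_mb(executor_mb: int, overhead_mb: int = None,
                         yarn_increment_mb: int = 512) -> int:
    if overhead_mb is None:
        overhead_mb = max(384, int(0.07 * executor_mb))  # default overhead
    requested = executor_mb + overhead_mb
    return -(-requested // yarn_increment_mb) * yarn_increment_mb  # ceil division

# Original settings: 2 GB executor, default overhead.
print(container_request_mb(2048))                    # 2560 MB = 2.5 GB
# Fixed settings: 4 GB executor with a 2 GB overhead override.
print(container_request_mb(4096, overhead_mb=2048))  # 6144 MB = 6 GB
```

Under this model the error's 2.5 GB is overhead + executor memory, and the fix works not because 2 GB of overhead alone exceeds 2.5 GB, but because the whole container grew from 2.5 GB to 6 GB.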


