When was automatic Spark RDD partition cache eviction implemented?
In the past, Spark would OOM a lot with `Spark java.lang.OutOfMemoryError: Java heap space`.

I've noticed that more recent versions of Spark (for me, recent is 1.6+, since I started with 0.7) don't throw an OOM if an RDD cannot fit in memory. Instead, RDD partitions are evicted and have to be recomputed.

I would like to know which version of Spark made this change. I've tried reading through a lot of https://spark.apache.org/releases/ but cannot find anything definitive. I'm fairly sure it was around 2.0, but can't find anything to prove it.
This Jira seems to imply that it was implemented along with unified memory management in 1.6: https://issues.apache.org/jira/browse/SPARK-14289
Borrowed storage memory may be evicted when memory pressure arises
From https://issues.apache.org/jira/secure/attachment/12765646/unified-memory-management-spark-10000.pdf , the design document attached to https://issues.apache.org/jira/browse/SPARK-10000 , which was implemented in 1.6: https://spark.apache.org/releases/spark-release-1-6-0.html
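To make the mechanism concrete, here is a toy Python model of the behavior that design document describes: storage and execution share one memory pool, a block that doesn't fit is simply not cached (rather than causing an OOM), and execution can evict cached blocks under memory pressure. This is not Spark's actual code; the class, method names, and sizes are all invented for illustration.

```python
# Toy model (NOT Spark source) of unified memory management per SPARK-10000:
# one shared pool, no OOM on cache miss, eviction under execution pressure.
from collections import OrderedDict

class UnifiedMemoryPool:
    def __init__(self, total):
        self.total = total
        self.execution_used = 0
        # Cached blocks in insertion (LRU) order: block name -> size.
        self.cached = OrderedDict()

    def storage_used(self):
        return sum(self.cached.values())

    def cache_block(self, name, size):
        """Cache a block if it fits; otherwise skip it (no OOM thrown).

        An uncached partition is recomputed from lineage when needed."""
        if self.execution_used + self.storage_used() + size > self.total:
            return False
        self.cached[name] = size
        return True

    def acquire_execution(self, size):
        """Execution memory may evict cached (borrowed) storage blocks."""
        while (self.execution_used + self.storage_used() + size > self.total
               and self.cached):
            self.cached.popitem(last=False)  # evict least-recently-cached
        if self.execution_used + self.storage_used() + size > self.total:
            raise MemoryError("not enough memory even after eviction")
        self.execution_used += size

pool = UnifiedMemoryPool(total=100)
pool.cache_block("rdd_0_part_0", 40)
pool.cache_block("rdd_0_part_1", 40)
pool.acquire_execution(50)   # pressure: evicts rdd_0_part_0
print(list(pool.cached))     # ['rdd_0_part_1']
```

The key point the model illustrates is the policy change: under unified memory management the worst case for a cached RDD partition is eviction and recomputation, not a heap-space error.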
Found it in the end.