
When was automatic Spark RDD partition cache eviction implemented?

In the past, Spark would OOM a lot: Spark java.lang.OutOfMemoryError: Java heap space

I've noticed that more recent versions of Spark (for me "recent" is 1.6+, since I started with 0.7) don't throw an OOM if an RDD cannot fit in memory. Instead, RDD partitions are evicted, and so they need to be recomputed.
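Not version-specific, but here is a minimal sketch of the behaviour described above (the master URL, record count, and per-record payload size are made up for illustration): with the MEMORY_ONLY storage level, partitions that do not fit in the storage pool are dropped rather than crashing the executor, and they are recomputed from the RDD lineage the next time an action touches them.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    object CacheEvictionSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("cache-eviction-sketch")
          .setMaster("local[2]")
        val sc = new SparkContext(conf)

        // Roughly 1 KiB per record; large enough overall to overflow the storage pool.
        val big = sc.parallelize(1 to 1000000, numSlices = 100)
          .map(i => (i, new Array[Byte](1024)))

        // MEMORY_ONLY: partitions that do not fit are simply not kept (or are evicted),
        // instead of the job dying with java.lang.OutOfMemoryError.
        big.persist(StorageLevel.MEMORY_ONLY)

        big.count() // first action: fills the storage pool; some partitions are not cached
        big.count() // second action: uncached/evicted partitions are recomputed from lineage

        sc.stop()
      }
    }

In the Spark UI's Storage tab this typically shows up as an RDD cached at less than 100%, rather than as a failed stage.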

I would like to know which version of Spark made this change.

I've tried reading through a lot of https://spark.apache.org/releases/ but cannot find anything definitive.

I'm pretty sure it was around 2.0, but can't find anything to prove it.

This JIRA seems to imply that it was implemented along with unified memory management in 1.6: https://issues.apache.org/jira/browse/SPARK-14289
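For reference, the unified memory manager introduced in 1.6 makes storage and execution share one pool, and its storage/execution split is controlled by two settings. A sketch of how they might be set (the 0.6/0.5 values are illustrative, not a recommendation):

    import org.apache.spark.SparkConf

    // Unified memory management (Spark 1.6+): storage and execution share one pool.
    val conf = new SparkConf()
      // Fraction of the heap (minus reserved memory) used for execution and storage combined.
      .set("spark.memory.fraction", "0.6")
      // Portion of that pool protected from eviction by execution; cached blocks beyond
      // this threshold can be evicted when execution needs the space.
      .set("spark.memory.storageFraction", "0.5")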

