
When was automatic Spark RDD partition cache eviction implemented?

In the past, Spark would OOM a lot with java.lang.OutOfMemoryError: Java heap space.

I've noticed that more recent versions of Spark (for me "recent" is 1.6+, since I started with 0.7) don't throw an OOM if a cached RDD cannot fit in memory. Instead, RDD partitions are evicted and have to be recomputed when they are needed again.
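To illustrate the behaviour I mean, here is a minimal sketch (object name and data sizes are made up): with StorageLevel.MEMORY_ONLY, partitions that don't fit in storage memory are simply dropped, and later actions recompute them from lineage instead of the job failing with an OOM.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CacheEvictionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("cache-eviction-sketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Build an RDD that is deliberately too large to cache fully in memory.
    val big = sc.parallelize(1 to 10000000, 100)
      .map(i => (i, Array.fill(128)(i.toLong)))

    // With MEMORY_ONLY, partitions that do not fit are dropped rather than
    // triggering java.lang.OutOfMemoryError; any dropped partition is
    // recomputed from its lineage the next time it is needed.
    big.persist(StorageLevel.MEMORY_ONLY)

    println(big.count()) // first pass: caches as many partitions as fit
    println(big.count()) // second pass: evicted partitions are recomputed

    sc.stop()
  }
}
```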

I would like to know what version of Spark made this change?

I've tried reading through a lot of the release notes at https://spark.apache.org/releases/ but cannot find anything definitive.

I'm pretty sure it was around 2.0, but can't find anything to prove it.

This JIRA seems to imply that it was implemented along with unified memory management in 1.6: https://issues.apache.org/jira/browse/SPARK-14289

Borrowed storage memory may be evicted when memory pressure arises

That quote is from the design document https://issues.apache.org/jira/secure/attachment/12765646/unified-memory-management-spark-10000.pdf, which is attached to https://issues.apache.org/jira/browse/SPARK-10000 (unified memory management), implemented in 1.6: https://spark.apache.org/releases/spark-release-1-6-0.html
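For reference, the unified memory manager introduced in 1.6 exposes a few configuration keys that control this borrowing/eviction behaviour. A minimal sketch with illustrative values only (not recommendations), using keys documented since Spark 1.6:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object UnifiedMemorySketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("unified-memory-sketch")
      .setMaster("local[2]")
      // Fraction of (JVM heap minus ~300MB reserved) shared by execution and storage.
      .set("spark.memory.fraction", "0.6")
      // Share of that unified region where cached blocks are safe from eviction;
      // storage used above this watermark can be evicted when execution needs memory.
      .set("spark.memory.storageFraction", "0.5")
      // "true" falls back to the pre-1.6 static memory manager (no borrowing).
      .set("spark.memory.useLegacyMode", "false")

    val sc = new SparkContext(conf)
    println(sc.getConf.get("spark.memory.fraction"))
    sc.stop()
  }
}
```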

Found it in the end
