
When was automatic Spark RDD partition cache eviction implemented?

In the past, Spark would OOM a lot with java.lang.OutOfMemoryError: Java heap space.

I've noticed that more recent versions of Spark (for me "recent" is 1.6+, since I started with 0.7) don't throw an OOM if a cached RDD cannot fit in memory. Instead, RDD partitions are evicted and have to be recomputed when they are needed again.
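To illustrate the behaviour I mean, here is a minimal sketch (object name and data sizes are made up): with StorageLevel.MEMORY_ONLY, partitions that don't fit in storage memory are simply dropped, and later actions recompute them from lineage instead of the job failing with an OOM.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CacheEvictionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("cache-eviction-sketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Build an RDD that is deliberately too large to cache fully in memory.
    val big = sc.parallelize(1 to 10000000, 100)
      .map(i => (i, Array.fill(128)(i.toLong)))

    // With MEMORY_ONLY, partitions that do not fit are dropped rather than
    // triggering java.lang.OutOfMemoryError; any dropped partition is
    // recomputed from its lineage the next time it is needed.
    big.persist(StorageLevel.MEMORY_ONLY)

    println(big.count()) // first pass: caches as many partitions as fit
    println(big.count()) // second pass: evicted partitions are recomputed

    sc.stop()
  }
}
```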

I would like to know what version of Spark made this change?

I've tried reading through a lot of the release notes at https://spark.apache.org/releases/ but cannot find anything definitive.

I'm pretty sure it was around 2.0, but can't find anything to prove it.

This JIRA seems to imply that it was implemented along with unified memory management in 1.6: https://issues.apache.org/jira/browse/SPARK-14289

Borrowed storage memory may be evicted when memory pressure arises

That quote is from the design document https://issues.apache.org/jira/secure/attachment/12765646/unified-memory-management-spark-10000.pdf, which is attached to https://issues.apache.org/jira/browse/SPARK-10000 (unified memory management), implemented in 1.6: https://spark.apache.org/releases/spark-release-1-6-0.html
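For reference, the unified memory manager introduced in 1.6 exposes a few configuration keys that control this borrowing/eviction behaviour. A minimal sketch with illustrative values only (not recommendations), using keys documented since Spark 1.6:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object UnifiedMemorySketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("unified-memory-sketch")
      .setMaster("local[2]")
      // Fraction of (JVM heap minus ~300MB reserved) shared by execution and storage.
      .set("spark.memory.fraction", "0.6")
      // Share of that unified region where cached blocks are safe from eviction;
      // storage used above this watermark can be evicted when execution needs memory.
      .set("spark.memory.storageFraction", "0.5")
      // "true" falls back to the pre-1.6 static memory manager (no borrowing).
      .set("spark.memory.useLegacyMode", "false")

    val sc = new SparkContext(conf)
    println(sc.getConf.get("spark.memory.fraction"))
    sc.stop()
  }
}
```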

Found it in the end
