java.lang.OutOfMemoryError: Unable to acquire 100 bytes of memory, got 0
I'm invoking PySpark with Spark 2.0 in local mode with the following command:
pyspark --executor-memory 4g --driver-memory 4g
The input dataframe is read from a TSV file and has 580K rows x 28 columns. I'm doing a few operations on the dataframe, and when I try to export it to a TSV file I get this error.
df.coalesce(1).write.save("sample.tsv", format="csv", header="true", delimiter="\t")
Any pointers on how to get rid of this error? I can easily display the df or count the rows.
The output dataframe is 3100 rows with 23 columns.
Error:
Job aborted due to stage failure: Task 0 in stage 70.0 failed 1 times, most recent failure: Lost task 0.0 in stage 70.0 (TID 1073, localhost): org.apache.spark.SparkException: Task failed while writing rows
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:261)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Unable to acquire 100 bytes of memory, got 0
at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:129)
at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPageIfNecessary(UnsafeExternalSorter.java:374)
at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:396)
at org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:94)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
at org.apache.spark.sql.execution.WindowExec$$anonfun$15$$anon$1.fetchNextRow(WindowExec.scala:300)
at org.apache.spark.sql.execution.WindowExec$$anonfun$15$$anon$1.<init>(WindowExec.scala:309)
at org.apache.spark.sql.execution.WindowExec$$anonfun$15.apply(WindowExec.scala:289)
at org.apache.spark.sql.execution.WindowExec$$anonfun$15.apply(WindowExec.scala:288)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:766)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:766)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:89)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:89)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:89)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:96)
at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:95)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply$mcV$sp(WriterContainer.scala:253)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:252)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:252)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1325)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:258)
... 8 more
Driver stacktrace:
I believe that the cause of this problem is coalesce(), which, despite the fact that it avoids a full shuffle (as repartition would do), still has to shrink the data into the requested number of partitions.

Here, you are requesting all the data to fit into one partition, so one task (and only one task) has to work with all the data, which may cause its container to suffer from memory limitations.
So, either ask for more partitions than 1, or avoid coalesce() in this case.
Otherwise, you could try the solutions provided in the links below for increasing your memory configuration:
The problem for me was indeed coalesce(). What I did was to export the file without coalesce(), writing parquet instead with df.write.parquet("testP"). Then I read the file back and exported it with coalesce(1). Hopefully it works for you as well.
In my case, replacing coalesce(1) with repartition(1) worked.
As was stated in other answers, use repartition(1) instead of coalesce(1). The reason is that repartition(1) ensures the upstream processing is done in parallel (multiple tasks/partitions), rather than on only one executor.
To quote the Dataset.coalesce() Spark docs:
However, if you're doing a drastic coalesce, e.g. to numPartitions = 1, this may result in your computation taking place on fewer nodes than you like (e.g. one node in the case of numPartitions = 1). To avoid this, you can call repartition(1) instead. This will add a shuffle step, but means the current upstream partitions will be executed in parallel (per whatever the current partitioning is).
In my case the driver was smaller than the workers. The issue was resolved by making the driver larger.