
Spark Streaming: java.lang.OutOfMemoryError: Java heap space

I am trying to run a simple Spark Streaming job written in Python:

#!/usr/bin/env python
from pyspark import SparkContext, SparkConf
from pyspark.streaming import StreamingContext

conf = SparkConf()
conf.setMaster("spark://master1:7077,master2:7077")
sc = SparkContext(conf=conf)
ssc = StreamingContext(sc, 1)

ssc.socketTextStream("master1", 9999).count().pprint()

ssc.start()
ssc.awaitTermination()

The task fails after running for a few seconds. This is the exception I see:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3236)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
    at com.esotericsoftware.kryo.io.Output.flush(Output.java:155)
    at com.esotericsoftware.kryo.io.Output.require(Output.java:135)
    at com.esotericsoftware.kryo.io.Output.writeString_slow(Output.java:420)
    at com.esotericsoftware.kryo.io.Output.writeString(Output.java:326)
    at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:153)
    at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:146)
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
    at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:158)
    at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:153)
    at org.apache.spark.storage.BlockManager.dataSerializeStream(BlockManager.scala:1190)
    at org.apache.spark.storage.BlockManager.dataSerialize(BlockManager.scala:1199)
    at org.apache.spark.storage.MemoryStore.putArray(MemoryStore.scala:132)
    at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:169)
    at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:143)
    at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:791)
    at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:638)
    at org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler.storeBlock(ReceivedBlockHandler.scala:77)
    at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushAndReportBlock(ReceiverSupervisorImpl.scala:156)
    at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushArrayBuffer(ReceiverSupervisorImpl.scala:127)
    at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl$$anon$3.onPushBlock(ReceiverSupervisorImpl.scala:108)
    at org.apache.spark.streaming.receiver.BlockGenerator.pushBlock(BlockGenerator.scala:294)
    at org.apache.spark.streaming.receiver.BlockGenerator.org$apache$spark$streaming$receiver$BlockGenerator$$keepPushingBlocks(BlockGenerator.scala:266)
    at org.apache.spark.streaming.receiver.BlockGenerator$$anon$1.run(BlockGenerator.scala:108)

A new task starts after this, so the job keeps running. Still, I would like to know what I am missing.

Update

spark-defaults.conf

spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.driver.memory              4g
spark.executor.memory            4g
spark.executor.extraJavaOptions  -XX:+PrintGCDetails
spark.deploy.recoveryMode        ZOOKEEPER
spark.deploy.zookeeper.url       master1:2181,master2:2181,master3:2181

I also tried setting the executor memory in the application itself:

conf = SparkConf()
conf.setMaster("spark://master1:7077,master2:7077")
conf.set("spark.executor.memory", "4g")
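One thing worth checking: memory settings applied through `SparkConf` inside the application only take effect if they are set before the `SparkContext` is created, and `spark.driver.memory` in particular has no effect there in client mode because the driver JVM has already started. A hedged alternative is to pass the settings at submit time; the script name below is assumed, and the `4g` sizes simply mirror the values from `spark-defaults.conf` above:

```shell
# Submit the streaming job with memory settings fixed before any JVM starts.
# --driver-memory must be supplied here (or via spark-defaults.conf);
# setting spark.driver.memory inside the app is too late in client mode.
spark-submit \
  --master spark://master1:7077,master2:7077 \
  --driver-memory 4g \
  --executor-memory 4g \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  streaming_job.py   # hypothetical filename for the script shown above
```

Flags passed on the command line take precedence over `spark-defaults.conf`, so this also makes it easy to verify in the application UI's Environment tab that the intended values actually reached the executors.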
