[英]Heap Space error: SparkListenerBus
I am trying to debug a PySpark program and quite frankly, I am stumped. 我试图调试PySpark程序,坦率地说,我很困惑。
I see the following error in the logs. 我在日志中看到以下错误。 I verified the input parameters - all appear to be in order.
我验证了输入参数-所有参数似乎都井井有条。
Driver and executors appear to be proper - about 3MB of 7GB being used on each node. 驱动程序和执行程序似乎是正确的-每个节点上使用了大约3MB的7GB。 I see that the DAG plan that is created is huge.
我看到创建的DAG计划非常庞大。 Could it be due to that?
可能是由于这个原因吗?
18/02/17 00:59:02 ERROR Utils: throw uncaught fatal error in thread SparkListenerBus 18/02/17 00:59:02错误实用程序:在线程SparkListenerBus中引发未捕获的致命错误
java.lang.OutOfMemoryError: Java heap space java.lang.OutOfMemoryError:Java堆空间
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at java.lang.StringBuilder.toString(StringBuilder.java:407)
at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:356)
at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.getText(ReaderBasedJsonParser.java:235)
at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:20)
at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:42)
at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:35)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3736)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2726)
at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:20)
at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:50)
at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:103)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:134)
at org.apache.spark.scheduler.EventLoggingListener.onOtherEvent(EventLoggingListener.scala:202)
at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:67)
at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:63)
at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:36)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:94)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
Exception in thread "SparkListenerBus" java.lang.OutOfMemoryError: Java heap space 线程“ SparkListenerBus”中的异常java.lang.OutOfMemoryError:Java堆空间
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at java.lang.StringBuilder.toString(StringBuilder.java:407)
at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:356)
at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.getText(ReaderBasedJsonParser.java:235)
at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:20)
at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:42)
at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:35)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3736)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2726)
at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:20)
at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:50)
at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:103)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:134)
at org.apache.spark.scheduler.EventLoggingListener.onOtherEvent(EventLoggingListener.scala:202)
at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:67)
at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:63)
at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:36)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:94)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
A workaround for this error is to use the setting: 解决此错误的方法是使用以下设置:
spark.eventLog.enabled=false
but as it would imply you do not get any event logs. 但它暗示您没有任何事件日志。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.