Spark SQL throws java.lang.OutOfMemoryError in yarn-cluster mode but works in yarn-client mode

Question

I have a simple Hive query that works fine in yarn-client mode from the pyspark shell, but it throws the error below when I run it in yarn-cluster mode.

Exception in thread "Thread-6" 
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-6"
Exception in thread "Reporter" 
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Reporter" 
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "sparkDriver-scheduler-1"

Cluster information: Hadoop 2.4, Spark 1.4.0-hadoop2.4, Hive 0.13.1. The script takes 10 columns from a Hive table, applies some transformations, and writes the result to a file.

Submit options: --num-executors 200 --executor-memory 8G --driver-memory 16G --executor-cores 3

Full stack trace:

py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o62.javaToPython.
: java.lang.OutOfMemoryError: PermGen space
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2570)
    at java.lang.Class.getDeclaredMethods(Class.java:1855)
    at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:206)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:132)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:1891)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:683)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:682)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
    at org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:682)
    at org.apache.spark.api.python.SerDeUtil$.javaToPython(SerDeUtil.scala:140)
    at org.apache.spark.sql.DataFrame.javaToPython(DataFrame.scala:1435)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)

Answer 1

java.lang.OutOfMemoryError: PermGen space at java.lang.ClassLoader.defineClass1(...

You are likely running out of "permanent generation" (PermGen) space in the driver's JVM. This area stores class definitions. In cluster mode the JVM needs to load more classes (I think this is because the driver runs inside the same JVM as the YARN Application Master). To increase the PermGen area, add the following option:

--driver-java-options -XX:MaxPermSize=256M

See also https://plumbr.eu/outofmemoryerror/permgen-space

When using HiveContext in your Python program, I've found that the following option is also needed:

--files /usr/hdp/current/spark-client/conf/hive-site.xml

See also https://community.hortonworks.com/questions/27239/executing-spark-submit-with-yarn-cluster-mode-and.html

I also wanted to use a specific version of Python, which requires another option:

--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/usr/local/bin/python2.7

See also https://issues.apache.org/jira/browse/SPARK-9235
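
The three options above can be combined into a single spark-submit call. The following is only a sketch that assembles the argument list in Python so it can be inspected before running. The helper name build_spark_submit and the script name my_job.py are hypothetical, and the paths are the ones quoted in this answer, so adjust them for your cluster.

```python
import shlex


def build_spark_submit(script, max_perm_size="256M",
                       hive_site=None, pyspark_python=None):
    """Assemble a spark-submit argument list for yarn-cluster mode.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    args = [
        "spark-submit",
        "--master", "yarn-cluster",
        # Raise the PermGen limit of the driver JVM.
        "--driver-java-options", "-XX:MaxPermSize=" + max_perm_size,
    ]
    if hive_site:
        # Ship hive-site.xml so HiveContext can find the metastore settings.
        args += ["--files", hive_site]
    if pyspark_python:
        # Choose the Python interpreter used by the Application Master.
        args += ["--conf",
                 "spark.yarn.appMasterEnv.PYSPARK_PYTHON=" + pyspark_python]
    args.append(script)
    return args


cmd = build_spark_submit(
    "my_job.py",  # hypothetical script name
    hive_site="/usr/hdp/current/spark-client/conf/hive-site.xml",
    pyspark_python="/usr/local/bin/python2.7",
)
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(a) for a in cmd))
```

To launch the job from Python instead of printing the command, the same list could be passed to subprocess.call(cmd).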

Answer 2

A small addition to Mark's answer: sometimes Spark with HiveContext throws an OutOfMemoryError that does not mention PermGen at all, yet only -XX:MaxPermSize helps.

So if you are dealing with an OOM when using Spark with HiveContext, also try -XX:MaxPermSize.
