
Why does Spark on YARN in cluster mode fail with "Exception in thread "Driver" java.lang.NullPointerException"?

I'm using emr-5.4.0 with Spark 2.1.0. I understand what a NullPointerException is; this question is about why one was thrown in this particular case.

I cannot really figure out why I got a NullPointerException in the driver thread.

I got this weird job failure with the following error:

18/03/29 20:07:52 INFO ApplicationMaster: Starting the user application in a separate Thread
18/03/29 20:07:52 INFO ApplicationMaster: Waiting for spark context initialization...
Exception in thread "Driver" java.lang.NullPointerException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
18/03/29 20:07:52 ERROR ApplicationMaster: Uncaught exception:
java.lang.IllegalStateException: SparkContext is null but app is still running!
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:415)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:766)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:764)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/03/29 20:07:52 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.lang.IllegalStateException: SparkContext is null but app is still running!)
18/03/29 20:07:52 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.lang.IllegalStateException: SparkContext is null but app is still running!)
18/03/29 20:07:52 INFO ApplicationMaster: Deleting staging directory hdfs://<ip-address>.ec2.internal:8020/user/hadoop/.sparkStaging/application_1522348295743_0010
18/03/29 20:07:52 INFO ShutdownHookManager: Shutdown hook called
End of LogType:stderr

I submitted the job like this:

spark-submit --deploy-mode cluster --master yarn --num-executors 40 --executor-cores 16 --executor-memory 100g --driver-cores 8 --driver-memory 100g --class <package.class_name> --jars <s3://s3_path/some_lib.jar> <s3://s3_path/class.jar>

And my class looks like this:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

class MyClass {

  def main(args: Array[String]): Unit = {
    val c = new MyClass()
    c.process()
  }

  def process(): Unit = {
    val sparkConf = new SparkConf().setAppName("my-test")
    val sparkSession: SparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
    import sparkSession.implicits._
    ....
  }

  ...
}

Change class MyClass to object MyClass and you're done.
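
A minimal sketch of the fix (assuming process() keeps the body from the question):

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object MyClass {

  def main(args: Array[String]): Unit = {
    process()
  }

  def process(): Unit = {
    val sparkConf = new SparkConf().setAppName("my-test")
    val sparkSession: SparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
    import sparkSession.implicits._
    // ... the rest of the job as in the question
  }
}

An object compiles to a class with a static main forwarder, which is what Spark's ApplicationMaster invokes reflectively (more on that below).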

While we're at it, I'd also change class MyClass to object MyClass extends App and remove def main(args: Array[String]): Unit (as given by extends App).
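
Under that change the application could look like this (again a sketch, with the job body elided as in the question):

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// extends App turns the object's body into the main method
object MyClass extends App {
  val sparkConf = new SparkConf().setAppName("my-test")
  val sparkSession: SparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
  import sparkSession.implicits._
  // ... the rest of the job as in the question
}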

I've reported an improvement for Spark 2.3.0 - [SPARK-23830] Spark on YARN in cluster deploy mode fail with NullPointerException when a Spark application is a Scala class not object - to have it reported nicely to an end user.


Digging deeper into how Spark on YARN works, the following message is printed when the ApplicationMaster of a Spark application starts the driver (you used --deploy-mode cluster --master yarn with spark-submit):

ApplicationMaster: Starting the user application in a separate Thread

Right after the INFO message you should see another:

ApplicationMaster: Waiting for spark context initialization...

This is part of the driver initialization when the ApplicationMaster runs.

The reason for Exception in thread "Driver" java.lang.NullPointerException is the following code:

val mainMethod = userClassLoader.loadClass(args.userClass)
  .getMethod("main", classOf[Array[String]])

My understanding is that, because MyClass is a class and not an object, the main method found above is an instance method (there is no static main to call), so the following line (which passes null as the target instance) "triggers" the NullPointerException:

mainMethod.invoke(null, userArgs.toArray)
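
You can reproduce this mechanism outside Spark. Here is a small self-contained sketch (ClassApp, ObjectApp, and Repro are hypothetical names for illustration, assumed to be compiled in the default package so Class.forName can find them by simple name). It shows that a main defined in a Scala object can be invoked reflectively with a null receiver, while the same main in a Scala class throws a NullPointerException:

import java.lang.reflect.Method

// Hypothetical stand-ins for the user's application
class ClassApp {
  def main(args: Array[String]): Unit = println("class main")
}

object ObjectApp {
  def main(args: Array[String]): Unit = println("object main")
}

object Repro {
  // Mirrors what ApplicationMaster does: load the class by name
  // and look up main(Array[String])
  def mainOf(className: String): Method =
    Class.forName(className).getMethod("main", classOf[Array[String]])

  def main(args: Array[String]): Unit = {
    // ObjectApp compiles to a class with a static main forwarder,
    // so a null receiver works and this prints "object main"
    mainOf("ObjectApp").invoke(null, Array.empty[String])

    // ClassApp only has an instance main, so a null receiver throws
    // java.lang.NullPointerException, just like in the driver thread
    mainOf("ClassApp").invoke(null, Array.empty[String])
  }
}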

The thread is indeed called Driver (as in Exception in thread "Driver" java.lang.NullPointerException), as set in these lines:

userThread.setContextClassLoader(userClassLoader)
userThread.setName("Driver")
userThread.start()

The line numbers differ since I used Spark 2.3.0 to reference the lines while you use emr-5.4.0 with Spark 2.1.0.
