Spark-submit issue loading classes
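I am using HDP 2.6. I downloaded the latest version of Spark (2.2.1) and am trying to run my jar (an assembly built against the same Spark version) with spark-submit. However, I get this error:
Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
My $HADOOP_CONF_DIR is /etc/hadoop/conf, which links to /usr/hdp/current/hadoop-client/conf.
The yarn.application.classpath property in my yarn-site.xml contains the entry /usr/hdp/current/hadoop-yarn-client/*. That directory contains the jar hadoop-yarn-common.jar, which contains the class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider. That is why I do not understand what is happening.
I made these checks based on the suggestions in link 1 and link 2. The full stack trace is below, in case it is useful: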
[root@omm101 bin]# pwd
/opt/spark_2.2.1/spark-2.2.1-bin-hadoop2.7/bin
[root@omm101 bin]# echo $HADOOP_CONF_DIR
/etc/hadoop/conf
[root@omm101 bin]# ./spark-submit --class net.atos.ooredooom.reportengine.trareport.TraReport --master yarn --deploy-mode client --num-executors 18 --executor-cores 5 --executor-memory 15g --driver-memory 1g /root/jars/report-compute-engine.jar
18/01/25 17:22:50 INFO spark.SparkContext: Running Spark version 2.2.1
18/01/25 17:22:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/01/25 17:22:51 INFO spark.SparkContext: Submitted application: TRA-Report
18/01/25 17:22:51 INFO spark.SecurityManager: Changing view acls to: root
18/01/25 17:22:51 INFO spark.SecurityManager: Changing modify acls to: root
18/01/25 17:22:51 INFO spark.SecurityManager: Changing view acls groups to:
18/01/25 17:22:51 INFO spark.SecurityManager: Changing modify acls groups to:
18/01/25 17:22:51 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
18/01/25 17:22:51 INFO util.Utils: Successfully started service 'sparkDriver' on port 41613.
18/01/25 17:22:51 INFO spark.SparkEnv: Registering MapOutputTracker
18/01/25 17:22:51 INFO spark.SparkEnv: Registering BlockManagerMaster
18/01/25 17:22:51 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/01/25 17:22:51 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/01/25 17:22:51 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-27408430-08bf-4252-b320-e68e6d103154
18/01/25 17:22:51 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
18/01/25 17:22:51 INFO spark.SparkEnv: Registering OutputCommitCoordinator
18/01/25 17:22:51 INFO util.log: Logging initialized @1743ms
18/01/25 17:22:51 INFO server.Server: jetty-9.3.z-SNAPSHOT
18/01/25 17:22:51 INFO server.Server: Started @1814ms
18/01/25 17:22:51 INFO server.AbstractConnector: Started ServerConnector@c835d12{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
18/01/25 17:22:51 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2a2bb0eb{/jobs,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@58783f6c{/jobs/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@512d92b{/jobs/job,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1bc53649{/jobs/job/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@47d93e0d{/stages,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@751e664e{/stages/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@182b435b{/stages/stage,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3704122f{/stages/stage/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@60afd40d{/stages/pool,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3f2049b6{/stages/pool/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@ea27e34{/storage,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@e72dba7{/storage/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1dfd5f51{/storage/rdd,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@24855019{/storage/rdd/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4d4d8fcf{/environment,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6f0628de{/environment/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1e392345{/executors,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4ced35ed{/executors/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7bd69e82{/executors/threadDump,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@51b01960{/executors/threadDump/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@27dc79f7{/static,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7674a051{/,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6754ef00{/api,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3301500b{/jobs/job/kill,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@15deb1dc{/stages/stage/kill,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.4.110.24:4040
18/01/25 17:22:51 INFO spark.SparkContext: Added JAR file:/root/jars/report-compute-engine.jar at spark://10.4.110.24:41613/jars/report-compute-engine.jar with timestamp 1516886571716
18/01/25 17:22:52 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
18/01/25 17:22:52 INFO impl.TimelineClientImpl: Timeline service address: http://omm103.in.nawras.com.om:8188/ws/v1/timeline/
18/01/25 17:22:52 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state STARTED; cause: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:161)
at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:153)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:910)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:910)
at net.atos.ooredooom.reportengine.trareport.TraReport$.main(TraReport.scala:14)
at net.atos.ooredooom.reportengine.trareport.TraReport.main(TraReport.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2219)
... 25 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
... 26 more
18/01/25 17:22:52 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:161)
at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:153)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:910)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:910)
at net.atos.ooredooom.reportengine.trareport.TraReport$.main(TraReport.scala:14)
at net.atos.ooredooom.reportengine.trareport.TraReport.main(TraReport.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2219)
... 25 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
... 26 more
18/01/25 17:22:52 INFO server.AbstractConnector: Stopped Spark@c835d12{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
18/01/25 17:22:52 INFO ui.SparkUI: Stopped Spark web UI at http://10.4.110.24:4040
18/01/25 17:22:52 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
18/01/25 17:22:52 INFO cluster.YarnClientSchedulerBackend: Stopped
18/01/25 17:22:52 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/01/25 17:22:52 INFO memory.MemoryStore: MemoryStore cleared
18/01/25 17:22:52 INFO storage.BlockManager: BlockManager stopped
18/01/25 17:22:52 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
18/01/25 17:22:52 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
18/01/25 17:22:52 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/01/25 17:22:52 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:161)
at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:153)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:910)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:910)
at net.atos.ooredooom.reportengine.trareport.TraReport$.main(TraReport.scala:14)
at net.atos.ooredooom.reportengine.trareport.TraReport.main(TraReport.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2219)
... 25 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
... 26 more
18/01/25 17:22:52 INFO util.ShutdownHookManager: Shutdown hook called
18/01/25 17:22:52 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-8ea65e11-e6d7-480c-874b-e35071fa6d7f
Any support or hints would be greatly appreciated.
Edit: For lack of a better idea, I added --jars /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-common-2.7.3.2.6.1.0-129.jar to spark-submit, and this causes an IllegalAccessError: tried to access method org.apache.hadoop.yarn.clientRMFailoverProxyProvider.getProxyInternal()
When I instead put hadoop-yarn-common-2.7.3.2.6.1.0-129.jar into .../<spark_dir>/jars, I got the same result.
So basically the question is why Spark is not using the HDP jar. It should be using it (because when I force it to use this lib, I see this IllegalAccessError). Spark 2.2.1 ships a jar hadoop-yarn-common-2.7.3.jar, but that jar does not contain RequestHedgingRMFailoverProxyProvider (maybe it is HDP-specific?).
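The jar check described in the question can be reproduced with a small script. The following is only a sketch: the function name jars_containing is made up for illustration, and the Spark jars directory is an assumption derived from the bin path shown in the transcript.

```python
import zipfile
from pathlib import Path

CLASS_NAME = "org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider"


def jars_containing(class_name, jar_dir):
    """Return the sorted names of the jars in jar_dir that contain class_name."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except (zipfile.BadZipFile, OSError):
            # Skip broken symlinks and files that are not valid archives
            continue
    return hits


if __name__ == "__main__":
    # The first directory comes from the question; the second is an assumed
    # location (the jars directory next to the bin directory in the transcript).
    for directory in (
        "/usr/hdp/current/hadoop-yarn-client",
        "/opt/spark_2.2.1/spark-2.2.1-bin-hadoop2.7/jars",
    ):
        print(directory, jars_containing(CLASS_NAME, directory))
```

If the class shows up only under the HDP directory, that matches the observation above that the hadoop-yarn-common jar bundled with Spark does not provide it.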
The HDP default configuration uses org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider as the failover proxy provider implementation. Spark, however, does not package this implementation. It packages another one, named ConfiguredRMFailoverProxyProvider. Both implement the interface RMFailoverProxyProvider. (See this documentation: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/apidocs/org/apache/hadoop/yarn/client/RMFailoverProxyProvider.html)
So to solve the problem, follow these instructions to change the configuration on HDP: https://community.hortonworks.com/content/supportkb/178800/errorclass-orgapachehadoopyarnclientrequesthedging.html
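As a rough illustration of what such a configuration change could look like, the sketch below rewrites one property in yarn-site.xml text. It assumes the linked instructions amount to pointing the YARN property yarn.client.failover-proxy-provider at the stock ConfiguredRMFailoverProxyProvider class. The answer itself does not name the property, and the helper set_property and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed property name and target class (not spelled out in the answer itself)
PROP = "yarn.client.failover-proxy-provider"
STOCK = "org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider"


def set_property(xml_text, name, value):
    """Return Hadoop-style configuration XML with property `name` set to `value`."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            node = prop.find("value")
            if node is None:
                node = ET.SubElement(prop, "value")
            node.text = value
            break
    else:
        # Property not present yet: append a new <property> element
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical excerpt of a yarn-site.xml as HDP might ship it
sample = """<configuration>
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(set_property(sample, PROP, STOCK))
```

Note that ElementTree drops XML comments when it parses, and on a managed cluster a hand-edited file may be overwritten by the management tool, so this is better used to inspect or stage a change than to apply it directly.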