
spark-submit unable to connect

After running the command

spark-submit --class org.apache.spark.examples.SparkPi --proxy-user yarn --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 --queue default ./examples/jars/spark-examples_2.11-2.3.0.jar 10000

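I get the following output, and it keeps retrying. Where am I going wrong? Am I missing some configuration?

I have created a new user for yarn and am running the command as that user.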

WARN  Utils:66 - Your hostname, ukaleem-HP-EliteBook-850-G3 resolves to a loopback address: 127.0.1.1; using 10.XX.XX.XX instead (on interface enp0s31f6)
2018-06-14 16:50:41 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
Warning: Local jar /home/yarn/Documents/Scala-Examples/./examples/jars/spark-examples_2.11-2.3.0.jar does not exist, skipping.
2018-06-14 16:50:42 INFO  RMProxy:98 - Connecting to ResourceManager at /0.0.0.0:8032
2018-06-14 16:50:44 INFO  Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Finally, it throws this exception:

    Exception in thread "main" java.net.ConnectException: Call From ukaleem-HP-EliteBook-850-G3/127.0.1.1 to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.GeneratedConstructorAccessor4.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy8.getClusterMetrics(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:206)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.getClusterMetrics(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:487)
    at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:155)
    at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:155)
    at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
    at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:59)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:154)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1146)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1518)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
    at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:179)
    at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:177)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:177)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 28 more
2018-06-14 17:10:53 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-06-14 17:10:53 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-5bddb7f3-165f-451c-8ab4-bb7729f4237c

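Edit: After adding the cluster's config files to my spark/conf directory, I now get a different error (shown below).

The files I added are:

* core-site.xml
* dfs.hosts
* masters
* slaves
* yarn-site.xml

and a few more. My understanding is that I only need yarn-site.xml to tell Spark where the YARN cluster is (ID, address, hostname, etc.).

All this time I had assumed that, even when submitting jobs to YARN, these configs need to go in the /etc/hadoop directory rather than in spark/conf. If not, what is the purpose of installing Hadoop (other than for communication)? And a follow-up question: if the configs do need to go in spark/conf, should HADOOP_CONF_DIR / YARN_CONF_DIR point to the etc/hadoop dir or to spark/conf?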

    INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
18/06/19 11:04:50 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm2 after 1 fail over attempts. Trying to fail over after sleeping for 38176ms.
java.net.ConnectException: Call From ukaleem-HP-EliteBook-850-G3/127.0.1.1 to svc-hadoop-mgnt-pre-c2-01.jamba.net:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy13.getClusterMetrics(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:206)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy14.getClusterMetrics(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:487)
    at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:155)
    at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:155)
    at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
    at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:59)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:154)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1146)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1518)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
    at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:179)
    at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:177)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:177)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 29 more

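Assuming you have a fully distributed YARN cluster: your spark-submit script is unable to find the configuration for the YARN ResourceManager (basically the master node of YARN). Make sure HADOOP_CONF_DIR is set correctly in your environment and points to your cluster's configuration, in particular your yarn-site.xml.

Edit: more details

The Hadoop packages ship with both server and client software. The server software is the set of daemons that run to make up the cluster. If your workstation is acting as a client (using the term loosely, not strictly related to Spark's --deploy-mode), then the Hadoop client software must know the network locations of the server daemons running in the cluster. If your yarn-site.xml is empty, it pulls its defaults from yarn-default.xml (which I believe is hardcoded). That is why your first log shows the client trying to reach the ResourceManager at 0.0.0.0:8032.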
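To see which ResourceManager address a client would end up with, something like the following could be run on the workstation. It is only a minimal sketch: the function names are made up for illustration, it assumes a non-HA cluster (it ignores the per-RM `rm1`/`rm2` properties used in HA mode), and it hardcodes the stock defaults (`0.0.0.0` and port `8032`) rather than reading yarn-default.xml. It also expects yarn-site.xml, if present, to be well-formed XML.

```python
import os
import xml.etree.ElementTree as ET

# Stock defaults used when yarn-site.xml does not override them (assumed here,
# not read from yarn-default.xml).
DEFAULT_RM_HOST = "0.0.0.0"
DEFAULT_RM_PORT = 8032


def read_hadoop_properties(xml_path):
    """Return the <name>/<value> pairs of a Hadoop-style XML config as a dict."""
    props = {}
    root = ET.parse(xml_path).getroot()
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def resolve_rm_address(conf_dir):
    """Work out the non-HA ResourceManager (host, port) from conf_dir/yarn-site.xml."""
    path = os.path.join(conf_dir, "yarn-site.xml")
    props = read_hadoop_properties(path) if os.path.isfile(path) else {}
    address = props.get("yarn.resourcemanager.address")
    # An explicit host:port wins, unless it still contains an unexpanded ${...}.
    if address and ":" in address and "${" not in address:
        host, _, port = address.rpartition(":")
        return host, int(port)
    host = props.get("yarn.resourcemanager.hostname") or DEFAULT_RM_HOST
    return host, DEFAULT_RM_PORT


if __name__ == "__main__":
    conf_dir = os.environ.get("HADOOP_CONF_DIR") or os.environ.get("YARN_CONF_DIR")
    if not conf_dir:
        print("Neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set")
    else:
        print("ResourceManager resolved to %s:%d" % resolve_rm_address(conf_dir))
```

If this reports `0.0.0.0:8032`, the directory the variable points to most likely does not contain a yarn-site.xml that names the ResourceManager.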

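Assuming your cluster is not running in high-availability mode and otherwise uses the default configuration, your workstation's yarn-site.xml should contain at least an entry like this: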

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>rm-host.yourdomain.com</value>
</property>

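Obviously, replace the hostname with the actual hostname where the ResourceManager is running. And of course, any Spark interaction with HDFS will require a properly configured hdfs-site.xml, and so on.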
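A "Connection refused" like the one in the question can also mean that nothing is listening at the configured address, so it may be worth checking plain TCP reachability before retrying spark-submit. The following is a rough sketch, not part of Hadoop or Spark: `can_connect` and the script name `check_rm.py` are made up for illustration, and port 8032 is simply the one that appears in the logs above.

```python
import socket
import sys


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_rm.py rm-host.yourdomain.com 8032
    if len(sys.argv) == 3:
        host, port = sys.argv[1], int(sys.argv[2])
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print("%s:%d is %s" % (host, port, state))
```

A successful connection only shows that something accepts TCP connections on that port; it does not prove that the listener is the ResourceManager or that the rest of the client configuration is correct.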

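Some cluster management software has something like a "generate client configs" feature (I am thinking specifically of my Cloudera experience), which gives you a .tar.gz with all the config files properly populated for accessing the cluster from an external workstation.

Further advice: if you plan to do a lot of Spark processing on YARN in this cluster, Spark recommends making sure the external shuffle service is configured to start up with the YARN NodeManagers. (Keep in mind that this config directive must be present in the yarn-site.xml of the nodes running YARN's NodeManager service, not on your workstation.)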
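As a rough way to verify that setting on a NodeManager host, the sketch below checks a yarn-site.xml for the two properties that the Spark on YARN documentation describes for the external shuffle service. The function name is made up for illustration, and the check only covers the YARN side: the Spark application would still need `spark.shuffle.service.enabled=true`, and the shuffle service jar has to be on the NodeManager classpath.

```python
import xml.etree.ElementTree as ET

# Class name of Spark's YARN auxiliary shuffle service.
SHUFFLE_CLASS = "org.apache.spark.network.yarn.YarnShuffleService"


def shuffle_service_configured(yarn_site_path):
    """Return True if yarn-site.xml registers spark_shuffle as a NodeManager aux service."""
    props = {}
    for prop in ET.parse(yarn_site_path).getroot().iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    # aux-services is a comma-separated list, e.g. "mapreduce_shuffle,spark_shuffle".
    services = [s.strip() for s in props.get("yarn.nodemanager.aux-services", "").split(",")]
    shuffle_class = props.get("yarn.nodemanager.aux-services.spark_shuffle.class")
    return "spark_shuffle" in services and shuffle_class == SHUFFLE_CLASS
```

It would be called with the path of the NodeManager's own yarn-site.xml, wherever that lives on the node.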

If you are running this on your local machine, update your /etc/hosts file so that your hostname is mapped to 127.0.0.1.

