Making spark use /etc/hosts file for binding in YARN cluster mode
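I have a Spark cluster set up on machines with two network interfaces, one public and one private. The /etc/hosts file on each machine lists the internal IP of every other machine in the cluster, like so: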
internal_ip FQDN
However, when I request a SparkContext through pyspark in YARN client mode (pyspark --master yarn --deploy-mode client), Akka binds to the public IP, and the connection times out.
15/11/07 23:29:23 INFO Remoting: Starting remoting
15/11/07 23:29:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkYarnAM@public_ip:44015]
15/11/07 23:29:23 INFO util.Utils: Successfully started service 'sparkYarnAM' on port 44015.
15/11/07 23:29:23 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
15/11/07 23:31:30 ERROR yarn.ApplicationMaster: Failed to connect to driver at yarn_driver_public_ip:48875, retrying ...
15/11/07 23:31:30 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Failed to connect to driver!
at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:427)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:293)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:149)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:574)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:65)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:572)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:599)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1446960366742_0002
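As the log shows, the private IP is ignored completely. How can I make YARN and Spark use the private IP addresses specified in the hosts file?
The cluster was set up using Ambari (HDP 2.4).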
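+1 for the question.
Spark uses Akka for communication.
So this is more of an Akka question than a Spark one.
If you need to bind your network interface to a different address, use the akka.remote.netty.tcp.bind-hostname and akka.remote.netty.tcp.bind-port settings.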
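This is currently an issue in Spark: the only way to get Spark to bind to the correct interface is to use a custom name server.
Spark essentially does a hostname lookup and binds Akka to the IP address it finds. The workaround is to create a custom BIND zone and run a name server.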
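To check which address a plain hostname lookup returns on a given node, something like the following Python sketch may help. It only approximates the lookup described above using the standard library's resolver; it is not Spark's own code, and the helper names resolve_ipv4 and local_binding_candidate are made up for illustration.

```python
import socket


def resolve_ipv4(hostname):
    """Return the IPv4 address the system resolver gives for hostname, or None."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # The name could not be resolved (no hosts entry, no DNS record).
        return None


def local_binding_candidate():
    """Return (hostname, ip) as seen by a plain hostname lookup on this machine."""
    hostname = socket.gethostname()
    return hostname, resolve_ipv4(hostname)


if __name__ == "__main__":
    name, ip = local_binding_candidate()
    print(f"{name} resolves to {ip}")
```

If the printed address is the public one, the hostname lookup on that node is not returning the internal IP, which would be consistent with the behaviour seen in the log.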