Making spark use /etc/hosts file for binding in YARN cluster mode
I have a Spark cluster set up on machines with two network interfaces, one public and one private. The /etc/hosts file on each machine in the cluster contains the internal IPs of all the other machines in the cluster, like so:
internal_ip FQDN
internal_ip FQDN
However, when I request a SparkContext via pyspark in YARN client mode (pyspark --master yarn --deploy-mode client), Akka binds to the public IP and a timeout occurs:
15/11/07 23:29:23 INFO Remoting: Starting remoting
15/11/07 23:29:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkYarnAM@public_ip:44015]
15/11/07 23:29:23 INFO util.Utils: Successfully started service 'sparkYarnAM' on port 44015.
15/11/07 23:29:23 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
15/11/07 23:31:30 ERROR yarn.ApplicationMaster: Failed to connect to driver at yarn_driver_public_ip:48875, retrying ...
15/11/07 23:31:30 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Failed to connect to driver!
at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:427)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:293)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:149)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:574)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:65)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:572)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:599)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1446960366742_0002
As seen from the log, the private IP is completely ignored. How can I make YARN and Spark use the private IP address specified in the hosts file?
The cluster was provisioned using Ambari (HDP 2.4).
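For context, one way to check what the FQDN resolves to locally, assuming the usual nsswitch.conf ordering where files is consulted before dns, is the following (a rough diagnostic sketch, not part of the original setup):

# Should print the internal IP if /etc/hosts is actually consulted for the FQDN
getent hosts $(hostname -f)
# Addresses the OS reports for this host's own name
hostname -i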
+1 for the question.
Spark uses Akka for communication, so this is more of an Akka question than a Spark question.
If you need to bind your network interface to a different address, use the akka.remote.netty.tcp.bind-hostname and akka.remote.netty.tcp.bind-port settings.
http://doc.akka.io/docs/akka/snapshot/additional/faq.html#Why_are_replies_not_received_from_a_remote_actor_
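In an Akka application.conf those settings look roughly like the sketch below (hostnames and the port are placeholders; plain spark-submit options do not expose these Akka keys directly, so this only illustrates the Akka-level mechanism the linked FAQ describes):

akka {
  remote {
    netty.tcp {
      # Address advertised to other nodes (what ends up in akka.tcp:// actor refs)
      hostname = "public-fqdn.example.com"
      port = 44015
      # Address and port the listening socket actually binds to on this machine
      bind-hostname = "internal-fqdn.example.com"
      bind-port = 44015
    }
  }
}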
This is currently an issue in Spark; the only way to get Spark to bind to the proper interface is to use a custom nameserver. Spark essentially does a hostname lookup and uses the IP address it finds to bind with Akka. The workaround is to create a custom BIND zone and run a nameserver.
https://issues.apache.org/jira/browse/SPARK-5113
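As a rough illustration of that workaround, a minimal BIND zone for the cluster could map the FQDNs to the internal addresses, so that the hostname lookup Spark performs returns the private IPs (all names and IPs below are made-up placeholders):

; db.cluster.internal -- zone serving the private-interface addresses
$TTL 300
@       IN  SOA ns1.cluster.internal. admin.cluster.internal. (
            2015110701 ; serial
            3600       ; refresh
            600        ; retry
            86400      ; expire
            300 )      ; negative caching TTL
        IN  NS  ns1.cluster.internal.
ns1     IN  A   10.0.0.2
node1   IN  A   10.0.0.10  ; internal_ip of node1
node2   IN  A   10.0.0.11  ; internal_ip of node2

Each node would then point /etc/resolv.conf at that nameserver so the cluster FQDNs resolve to the internal IPs before any public DNS is consulted.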