ApplicationMaster not able to find Spark Driver despite binding address set (Cluster Mode Yarn)
I have a 3-node cluster, and the UIs show that everything is connected correctly. Now, if I submit a Spark application with deploy mode set to cluster, I get:

java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed

The full error from the log of one of the slaves is below. (When the application is launched on the current node, it runs fine.)
The Spark session is defined like the following:
SparkSession spark = SparkSession.builder().enableHiveSupport().appName("sparkApp")
.master("yarn").config("spark.driver.host","VM2").getOrCreate();
2021-11-09 17:59:52,149 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:504)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:268)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:899)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:898)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:898)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:438)
    at sun.nio.ch.Net.bind(Net.java:430)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:225)
    at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:134)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:550)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1334)
    at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:506)
    at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:491)
    at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:973)
    at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:248)
    at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:356)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)
Also, through the Yarn interface: if the first attempt at executing the application happens on VM2 (the current node), then it runs; otherwise it does not (unless a later attempt lands on VM2).
I think you should alter your code:
SparkSession spark = SparkSession.builder().enableHiveSupport().appName("sparkApp").master("yarn").getOrCreate();
You are using YARN (not standalone), so you do not need to specify the driver host. YARN does that assignment for you: in cluster mode the driver runs inside the ApplicationMaster, on whichever node YARN picks, so hardcoding spark.driver.host to VM2 makes the driver try to bind to VM2's address on a machine that does not own it.
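The failure mode can be reproduced with nothing but the JDK: a socket can only bind to an address the local machine actually owns. The class name `BindDemo` and the address `203.0.113.9` (a TEST-NET-3 address no host owns) below are made up for illustration; in your cluster, the role of the unreachable address is played by VM2's address as seen from any other node.

```java
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindDemo {
    public static void main(String[] args) throws Exception {
        // Binding to an address this machine owns succeeds
        // (0.0.0.0 = any local interface, port 0 = ephemeral):
        try (ServerSocket ok = new ServerSocket()) {
            ok.bind(new InetSocketAddress("0.0.0.0", 0));
            System.out.println("bound to " + ok.getLocalSocketAddress());
        }

        // Binding to an address the machine does NOT own fails the
        // same way the sparkDriver service does on the other nodes:
        try (ServerSocket bad = new ServerSocket()) {
            bad.bind(new InetSocketAddress("203.0.113.9", 0));
        } catch (BindException e) {
            System.out.println("java.net.BindException: " + e.getMessage());
        }
    }
}
```

This is why the application only works when the attempt happens to land on VM2: that is the one node where the configured address is local.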
The documentation does say:

spark.driver.host: Hostname or IP address for the driver. This is used for communicating with the executors and the standalone Master.
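If you genuinely need to control where the driver listens (for example on multi-homed nodes), the error message itself points at spark.driver.bindAddress, which is better set per node in the node's own configuration than hardcoded in application code. A sketch, not a definitive fix; 0.0.0.0 ("listen on all local interfaces") is only one possible choice:

```
# conf/spark-defaults.conf on each node (illustrative)
spark.driver.bindAddress  0.0.0.0
```

Unlike a value baked into SparkSession.builder(), a per-node setting stays correct no matter which node YARN places the ApplicationMaster on.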