
ApplicationMaster not able to find Spark Driver despite binding address set (Cluster Mode Yarn)

I have a 3-node cluster, and the UIs show that everything is well connected. When I submit a Spark application with the deploy mode set to cluster, I get: java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed. The full error, taken from the log of one of the slaves, is shown after the code. When the application is launched on the current node, it runs well.
The Spark session is defined as follows:

SparkSession spark = SparkSession.builder().enableHiveSupport().appName("sparkApp")
        .master("yarn").config("spark.driver.host", "VM2").getOrCreate();

2021-11-09 17:59:52,149 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:504)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:268)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:899)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:898)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:898)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:438)
    at sun.nio.ch.Net.bind(Net.java:430)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:225)
    at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:134)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:550)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1334)
    at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:506)
    at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:491)
    at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:973)
    at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:248)
    at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:356)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)

Also, the YARN UI shows that if the first attempt of the application runs on VM2 (the current node), it succeeds; otherwise it fails, unless the second attempt lands on VM2.

I think you should change your code to:

SparkSession spark = SparkSession.builder().enableHiveSupport().appName("sparkApp").master("yarn").getOrCreate();

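You are using YARN (not standalone), so you do not need to specify the driver host; YARN does the assignment for you. In cluster mode the driver runs inside the ApplicationMaster on whichever node YARN picks, so pinning spark.driver.host to VM2 can only work when that node happens to be VM2, which matches what you are seeing.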

The documentation says:

spark.driver.host: Hostname or IP address for the driver. This is used for communicating with the executors and the standalone Master.
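To see why a pinned driver host breaks on the other nodes, here is a minimal sketch using only the JDK, not Spark itself. The class DriverBindCheck and its methods are hypothetical helpers made up for illustration; they simply check whether a host name belongs to the machine running the code and whether a socket can be bound to it on a random free port, which is roughly what the 'sparkDriver' service attempts.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; this is not part of Spark or YARN.
class DriverBindCheck {

    // Returns true if the host resolves to an address owned by this machine.
    static boolean isLocalAddress(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            return addr.isAnyLocalAddress()
                    || addr.isLoopbackAddress()
                    || NetworkInterface.getByInetAddress(addr) != null;
        } catch (IOException e) {
            // Unknown host or interface lookup failure: treat as not local.
            return false;
        }
    }

    // Tries to bind a server socket to the host on a random free port (port 0).
    static boolean canBind(String host) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(host, 0));
            return true;
        } catch (IOException e) {
            // On a machine that does not own the address this is typically
            // java.net.BindException: Cannot assign requested address
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the value you would put in spark.driver.host, e.g. VM2.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println(host + " local=" + isLocalAddress(host)
                + " bindable=" + canBind(host));
    }
}
```

Running this with VM2 as the argument on each node should show which machines are able to bind that address. Given the symptoms in the question, only VM2 itself would be expected to succeed, which is why leaving spark.driver.host unset in cluster mode is the safer choice.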
