
Worker failed to connect to master in Spark Apache

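I am deploying a Spark Apache application using the standalone cluster manager. My architecture uses 2 Windows machines: one acts as the master and the other as the slave (worker).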

Master: on this machine I run: \bin>spark-class org.apache.spark.deploy.master.Master and this is what the Web UI shows:

Slave: on this machine I run: \bin>spark-class org.apache.spark.deploy.worker.Worker spark://192.*.*.186:7077 and this is what the Web UI shows:

The problem is that the worker node cannot connect to the master node and shows the following error:

17/09/26 16:05:17 INFO Worker: Connecting to master 192.*.*.186:7077...
17/09/26 16:05:22 WARN Worker: Failed to connect to master 192.*.*.186:7077
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
    at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:241)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
  Caused by: java.io.IOException: Failed to connect to /192.*.*.186:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
    ... 4 more
 Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection timed out: no further information: /192.*.*.186:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:257)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:631)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    ... 1 more

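What could be going on here, given that the firewall is disabled on both machines and that I tested the connection between them (using nmap) and everything looked fine? With telnet, however, I get this error: Connecting To 192.*.*.186...Could not open connection to the host, on port 23: Connect failed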
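Note that the telnet message above refers to port 23, which is telnet's default when no port is given, so it says nothing about whether the master's port 7077 is reachable. The following is only a minimal sketch of probing that port directly. It assumes a bash shell with the `timeout` utility is available (on Windows that would mean something like Git Bash or WSL, which the question does not mention). The names `check_port`, `MASTER_HOST`, and `MASTER_PORT` are hypothetical, and the default address is a placeholder because the real one is masked in the question.

```shell
#!/usr/bin/env bash
# Check whether a TCP port accepts connections, using bash's /dev/tcp redirection.
# Usage: check_port HOST PORT   (exit status 0 = open, non-zero = closed or timed out)
check_port() {
  local host="$1" port="$2"
  # Run in a child bash so a failed redirection cannot affect the calling shell;
  # `timeout` bounds the wait when packets are silently dropped.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Placeholder values: set MASTER_HOST to the master's real address before running.
MASTER_HOST="${MASTER_HOST:-127.0.0.1}"
MASTER_PORT="${MASTER_PORT:-7077}"

if check_port "$MASTER_HOST" "$MASTER_PORT"; then
  echo "port $MASTER_PORT on $MASTER_HOST is reachable"
else
  echo "port $MASTER_PORT on $MASTER_HOST is NOT reachable"
fi
```

Run from the worker machine, a "NOT reachable" result after the full timeout would be consistent with the "Connection timed out" in the stack trace, whereas an immediate failure would more likely mean nothing is listening on that address and port.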

Can you share your spark-env.sh configuration? That would help pinpoint your problem.

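My initial thought is that you need to export SPARK_MASTER_HOST=(master ip) instead of SPARK_MASTER_IP in the spark-env.sh file. You need to do this on both the master and the slave. Also export SPARK_LOCAL_IP on both the master and the slave.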
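As a rough illustration of that suggestion, a spark-env.sh fragment might look like the sketch below. The values in angle brackets are placeholders, not real addresses, and must be replaced on each machine. Note also that, as far as I know, Spark's Windows launch scripts read conf\spark-env.cmd rather than spark-env.sh, so on the Windows machines in the question the same variables would likely need to be set there with `set` instead of `export`; treat that as an assumption to verify against your Spark version.

```shell
# conf/spark-env.sh (sketch with placeholder values; substitute your own addresses)

# Address the master binds to and that workers use in spark://<master-ip>:7077.
# Use the same master address on both the master and the worker.
export SPARK_MASTER_HOST="<master-ip>"

# Address this particular machine should bind to.
# On the master this is the master's IP; on the worker it is the worker's IP.
export SPARK_LOCAL_IP="<this-machine-ip>"
```

After changing these values, the master and worker need to be restarted so the new settings take effect.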

You need to set the environment variables SPARK_MASTER_HOST and SPARK_LOCAL_HOST to localhost.

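SPARK_MASTER_IP is now deprecated in favor of SPARK_MASTER_HOST. SPARK_LOCAL_IP, by contrast, is still listed among the environment variables in the Spark configuration documentation.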


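Notice: Technical posts on this site are published under the CC BY-SA 4.0 license. If you repost this content, please credit this site's URL or the original source address. For any questions, contact: yoyou2525@163.com.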
