Flink task manager could not register at job manager

I'm trying to create a simple multi-node Flink cluster (1 master, 1 slave). When I start the cluster using "./bin/start-cluster.sh", both the job manager and the task manager start, but the task manager is not able to register at the job manager. After a few minutes of retrying, the task manager dies.

Details about the environment:

  • I'm working with Google Cloud VMs. The OS is Ubuntu x86_64.
  • I tried Flink versions 1.7.2 and 1.8.0. Both gave the same error.

job manager hostname = ubuntu-test-1 (10.142.0.40)
task manager hostname = ubuntu-test-2 (10.142.15.250)

$ cat conf/flink-conf.yaml
env.java.home: /opt/sample/include/jdk
jobmanager.rpc.address: 10.142.0.40
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 1
parallelism.default: 1
rest.port: 8081

$ cat conf/masters
10.142.0.40:8081

$ cat conf/slaves
10.142.15.250

Below is the complete log from the task manager:

2019-06-25 05:44:36,335 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - --------------------------------------------------------------------------------
2019-06-25 05:44:36,336 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  Starting TaskManager (Version: 1.7.2, Rev:ceba8af, Date:11.02.2019 @ 14:17:09 UTC)
2019-06-25 05:44:36,337 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  OS current user: sample
2019-06-25 05:44:36,337 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  Current Hadoop/Kerberos user: <no hadoop dependency found>
2019-06-25 05:44:36,337 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.121-b13
2019-06-25 05:44:36,337 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  Maximum heap size: 922 MiBytes
2019-06-25 05:44:36,337 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  JAVA_HOME: (not set)
2019-06-25 05:44:36,337 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  No Hadoop Dependency available
2019-06-25 05:44:36,337 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  JVM Options:
2019-06-25 05:44:36,337 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     -XX:+UseG1GC
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     -Xms922M
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     -Xmx922M
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     -XX:MaxDirectMemorySize=8388607T
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     -Dlog.file=/var/tmp/flink-1.7.2/log/flink-sample-taskexecutor-0-ubuntu-test-2.log
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     -Dlog4j.configuration=file:/var/tmp/flink-1.7.2/conf/log4j.properties
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     -Dlogback.configurationFile=file:/var/tmp/flink-1.7.2/conf/logback.xml
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  Program Arguments:
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     --configDir
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -     /var/tmp/flink-1.7.2/conf
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       -  Classpath: /var/tmp/flink-1.7.2/lib/flink-python_2.11-1.7.2.jar:/var/tmp/flink-1.7.2/lib/log4j-1.2.17.jar:/var/tmp/flink-1.7.2/lib/slf4j-log4j12-1.7.15.jar:/var/tmp/flink-1.7.2/lib/flink-dist_2.11-1.7.2.jar:::
2019-06-25 05:44:36,338 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - --------------------------------------------------------------------------------
2019-06-25 05:44:36,339 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Registered UNIX signal handlers for [TERM, HUP, INT]
2019-06-25 05:44:36,343 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Maximum number of open file descriptors is 100000.
2019-06-25 05:44:36,352 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: env.java.home, /opt/sample/include/jdk
2019-06-25 05:44:36,353 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, 10.142.0.40
2019-06-25 05:44:36,353 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2019-06-25 05:44:36,353 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2019-06-25 05:44:36,353 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2019-06-25 05:44:36,353 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-06-25 05:44:36,353 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2019-06-25 05:44:36,354 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: rest.port, 8081
2019-06-25 05:44:36,360 INFO  org.apache.flink.core.fs.FileSystem                           - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available.
2019-06-25 05:44:36,376 INFO  org.apache.flink.runtime.security.modules.HadoopModuleFactory  - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
2019-06-25 05:44:36,395 INFO  org.apache.flink.runtime.security.SecurityUtils               - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath.
2019-06-25 05:44:36,559 WARN  org.apache.flink.configuration.Configuration                  - Config uses deprecated configuration key 'jobmanager.rpc.address' instead of proper key 'rest.address'
2019-06-25 05:44:36,563 INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils            - Trying to select the network interface and address to use by connecting to the leading JobManager.
2019-06-25 05:44:36,564 INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils            - TaskManager will try to connect for 10000 milliseconds before falling back to heuristics
2019-06-25 05:44:36,567 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Retrieved new target address /10.142.0.40:6123.
2019-06-25 05:44:36,571 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - TaskManager will use hostname/address 'ubuntu-test-2' (10.142.15.250) for communication.
2019-06-25 05:44:36,574 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Trying to start actor system at ubuntu-test-2:0
2019-06-25 05:44:36,935 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2019-06-25 05:44:37,004 INFO  akka.remote.Remoting                                          - Starting remoting
2019-06-25 05:44:37,108 INFO  akka.remote.Remoting                                          - Remoting started; listening on addresses :[akka.tcp://flink@ubuntu-test-2:33391]
2019-06-25 05:44:37,115 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Actor system started at akka.tcp://flink@ubuntu-test-2:33391
2019-06-25 05:44:37,121 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Trying to start actor system at ubuntu-test-2:0
2019-06-25 05:44:37,138 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2019-06-25 05:44:37,144 INFO  akka.remote.Remoting                                          - Starting remoting
2019-06-25 05:44:37,152 INFO  akka.remote.Remoting                                          - Remoting started; listening on addresses :[akka.tcp://flink-metrics@ubuntu-test-2:46253]
2019-06-25 05:44:37,153 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Actor system started at akka.tcp://flink-metrics@ubuntu-test-2:46253
2019-06-25 05:44:37,166 INFO  org.apache.flink.runtime.metrics.MetricRegistryImpl           - No metrics reporter configured, no metrics will be exposed/reported.
2019-06-25 05:44:37,171 INFO  org.apache.flink.runtime.blob.PermanentBlobCache              - Created BLOB cache storage directory /tmp/blobStore-4219e8ab-64ab-4eff-8320-8a50b550959d
2019-06-25 05:44:37,174 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Created BLOB cache storage directory /tmp/blobStore-959579c0-4892-4ba8-b7d3-63969e84f554
2019-06-25 05:44:37,175 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Starting TaskManager with ResourceID: 3743bd08e81673b79e96d98ebab7a58a
2019-06-25 05:44:37,179 INFO  org.apache.flink.runtime.io.network.netty.NettyConfig         - NettyConfig [server address: ubuntu-test-2/10.142.15.250, server port: 0, ssl enabled: false, memory segment size (bytes): 32768, transport type: NIO, number of server threads: 1 (manual), number of client threads: 1 (manual), server connect backlog: 0 (use Netty's default), client connect timeout (sec): 120, send/receive buffer size (bytes): 0 (use Netty's default)]
2019-06-25 05:44:37,224 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerServices     - Temporary file directory '/tmp': total 96 GB, usable 86 GB (89.58% usable)
2019-06-25 05:44:37,305 INFO  org.apache.flink.runtime.io.network.buffer.NetworkBufferPool  - Allocated 102 MB for network buffer pool (number of memory segments: 3278, bytes per segment: 32768).
2019-06-25 05:44:37,354 INFO  org.apache.flink.runtime.query.QueryableStateUtils            - Could not load Queryable State Client Proxy. Probable reason: flink-queryable-state-runtime is not in the classpath. To enable Queryable State, please move the flink-queryable-state-runtime jar from the opt to the lib folder.
2019-06-25 05:44:37,355 INFO  org.apache.flink.runtime.query.QueryableStateUtils            - Could not load Queryable State Server. Probable reason: flink-queryable-state-runtime is not in the classpath. To enable Queryable State, please move the flink-queryable-state-runtime jar from the opt to the lib folder.
2019-06-25 05:44:37,357 INFO  org.apache.flink.runtime.io.network.NetworkEnvironment        - Starting the network environment and its components.
2019-06-25 05:44:37,389 INFO  org.apache.flink.runtime.io.network.netty.NettyClient         - Successful initialization (took 30 ms).
2019-06-25 05:44:37,432 INFO  org.apache.flink.runtime.io.network.netty.NettyServer         - Successful initialization (took 42 ms). Listening on SocketAddress /10.142.15.250:41521.
2019-06-25 05:44:37,433 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerServices     - Limiting managed memory to 0.7 of the currently free heap space (640 MB), memory will be allocated lazily.
2019-06-25 05:44:37,436 INFO  org.apache.flink.runtime.io.disk.iomanager.IOManager          - I/O manager uses directory /tmp/flink-io-9b6408aa-3a29-477b-8a4b-661401bad5b6 for spill files.
2019-06-25 05:44:37,496 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration  - Messages have a max timeout of 10000 ms
2019-06-25 05:44:37,503 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Starting RPC endpoint for org.apache.flink.runtime.taskexecutor.TaskExecutor at akka://flink/user/taskmanager_0 .
2019-06-25 05:44:37,520 INFO  org.apache.flink.runtime.taskexecutor.JobLeaderService        - Start job leader service.
2019-06-25 05:44:37,521 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Connecting to ResourceManager akka.tcp://flink@10.142.0.40:6123/user/resourcemanager(00000000000000000000000000000000).
2019-06-25 05:44:37,521 INFO  org.apache.flink.runtime.filecache.FileCache                  - User file cache uses directory /tmp/flink-dist-cache-504118c3-1bc2-4624-b1c4-7eacce681ba9
2019-06-25 05:44:47,542 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:45:07,580 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:45:27,620 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:45:47,660 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:46:07,700 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:46:27,741 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:46:47,780 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:47:07,820 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:47:27,860 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:47:47,900 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:48:07,940 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:48:27,980 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:48:48,020 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:49:08,060 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:49:28,100 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@10.142.0.40:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@10.142.0.40:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-06-25 05:49:37,541 ERROR org.apache.flink.runtime.taskexecutor.TaskExecutor            - Fatal error occurred in TaskExecutor akka.tcp://flink@ubuntu-test-2:33391/user/taskmanager_0.
org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException: Could not register at the ResourceManager within the specified maximum registration duration 300000 ms. This indicates a problem with this instance. Terminating now.
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.registrationTimeout(TaskExecutor.java:1037)
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$startRegistrationTimeout$3(TaskExecutor.java:1023)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:332)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:158)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
        at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
        at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
        at akka.actor.ActorCell.invoke(ActorCell.scala:495)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
        at akka.dispatch.Mailbox.run(Mailbox.scala:224)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2019-06-25 05:49:37,544 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Fatal error occurred while executing the TaskManager. Shutting it down...
org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException: Could not register at the ResourceManager within the specified maximum registration duration 300000 ms. This indicates a problem with this instance. Terminating now.
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.registrationTimeout(TaskExecutor.java:1037)
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$startRegistrationTimeout$3(TaskExecutor.java:1023)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:332)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:158)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
        at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
        at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
        at akka.actor.ActorCell.invoke(ActorCell.scala:495)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
        at akka.dispatch.Mailbox.run(Mailbox.scala:224)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2019-06-25 05:49:37,550 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Stopping TaskExecutor akka.tcp://flink@ubuntu-test-2:33391/user/taskmanager_0.
2019-06-25 05:49:37,551 INFO  org.apache.flink.runtime.state.TaskExecutorLocalStateStoresManager  - Shutting down TaskExecutorLocalStateStoresManager.
2019-06-25 05:49:37,554 INFO  org.apache.flink.runtime.io.disk.iomanager.IOManager          - I/O manager removed spill file directory /tmp/flink-io-9b6408aa-3a29-477b-8a4b-661401bad5b6
2019-06-25 05:49:37,554 INFO  org.apache.flink.runtime.io.network.NetworkEnvironment        - Shutting down the network environment and its components.
2019-06-25 05:49:37,554 INFO  org.apache.flink.runtime.io.network.netty.NettyClient         - Successful shutdown (took 0 ms).
2019-06-25 05:49:37,555 INFO  org.apache.flink.runtime.io.network.netty.NettyServer         - Successful shutdown (took 0 ms).
2019-06-25 05:49:37,561 INFO  org.apache.flink.runtime.taskexecutor.JobLeaderService        - Stop job leader service.
2019-06-25 05:49:37,562 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Stopped TaskExecutor akka.tcp://flink@ubuntu-test-2:33391/user/taskmanager_0.
2019-06-25 05:49:37,563 INFO  org.apache.flink.runtime.blob.PermanentBlobCache              - Shutting down BLOB cache
2019-06-25 05:49:37,563 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Shutting down BLOB cache
2019-06-25 05:49:37,570 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2019-06-25 05:49:37,576 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2019-06-25 05:49:37,577 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2019-06-25 05:49:37,580 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2019-06-25 05:49:37,584 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2019-06-25 05:49:37,596 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2019-06-25 05:49:37,597 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
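
The gap between the "Connecting to ResourceManager" line and the fatal error in the log above is roughly the 300000 ms maximum registration duration named in the exception. A minimal sketch for measuring that gap between any two log lines, assuming the timestamp layout shown in this log (the helper names `log_timestamp` and `seconds_between` are made up for illustration):

```python
from datetime import datetime

# Timestamp layout used at the start of each log line above,
# e.g. "2019-06-25 05:44:37,521" (23 characters).
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
LOG_TS_LENGTH = 23


def log_timestamp(line: str) -> datetime:
    """Parse the leading timestamp of one log line."""
    return datetime.strptime(line[:LOG_TS_LENGTH], LOG_TS_FORMAT)


def seconds_between(first_line: str, last_line: str) -> float:
    """Elapsed seconds between the timestamps of two log lines."""
    delta = log_timestamp(last_line) - log_timestamp(first_line)
    return delta.total_seconds()
```

Passing the 05:44:37,521 "Connecting to ResourceManager" line and the 05:49:37,541 "Fatal error occurred in TaskExecutor" line gives about 300 seconds, which suggests the task manager never got a single reply from the job manager during the whole window.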

It looks like the problem was that I used IP addresses instead of hostnames. This was already pointed out in another thread on SO. When I read that thread, I assumed the reason was that a host's IP address can change over time. It turns out that using IP addresses does not work even if they don't change.

I wonder why, then, the Flink documentation shows IP addresses: https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/deployment/cluster_setup.html
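
Before switching from IP addresses to hostnames, it can help to confirm from the task manager host that the job manager hostname resolves and that the RPC port accepts TCP connections. The following is a minimal sketch using only the Python standard library; `can_connect` and `resolve` are hypothetical helper names, and a successful TCP connect only rules out DNS and firewall problems, it does not prove that registration will succeed.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def resolve(host: str) -> list:
    """Return the sorted IP addresses that host resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    # The address is the first element of the sockaddr tuple.
    return sorted({info[4][0] for info in infos})
```

With the values from the question, one would run `resolve("ubuntu-test-1")` and `can_connect("ubuntu-test-1", 6123)` on ubuntu-test-2. If the hostname does not resolve to 10.142.0.40, or the port is unreachable, the hostname or firewall setup needs fixing before Flink itself is involved.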

I had the same issue.

Make sure you are using JDK 1.8, as Flink 1.7.2 needs JDK 1.8. That worked for me!

For a Docker setup, check whether the environment variable below is set:

FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"

In other cases, check the jobmanager.rpc.address setting in the configuration.
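
To see which job manager address a process would actually pick up, one option is to parse the `key: value` lines from either `FLINK_PROPERTIES` or `conf/flink-conf.yaml`. This is a simplified sketch, not Flink's own parser: it assumes flat `key: value` lines and treats everything after `#` as a comment, and the function names are made up for illustration.

```python
import os


def parse_flink_properties(text: str) -> dict:
    """Parse flat 'key: value' lines into a dict (simplified, not Flink's parser)."""
    props = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)  # split on the first colon only
        props[key.strip()] = value.strip()
    return props


def jobmanager_address_from_env():
    """Return jobmanager.rpc.address from FLINK_PROPERTIES, or None if unset."""
    props = parse_flink_properties(os.environ.get("FLINK_PROPERTIES", ""))
    return props.get("jobmanager.rpc.address")
```

For the variable shown above, `parse_flink_properties("jobmanager.rpc.address: jobmanager")` yields a dict whose `jobmanager.rpc.address` entry is `jobmanager`, which should match the hostname the job manager container is reachable under.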
