I have a problem with Elasticsearch 5 in Docker.
Stack compose file:
version: "3.4"
services:
  elastic01: &elasticbase
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.7
    networks:
      - default
    restart: always
    environment:
      - node.name=elastic01
      - cluster.name=elastic
      - network.host=0.0.0.0
      - xpack.security.enabled=false
      - xpack.monitoring.enabled=false
      - xpack.watcher.enabled=false
      - bootstrap.memory_lock=false ## Docker swarm does not support that
      - discovery.zen.minimum_master_nodes=2
      - discovery.zen.ping.unicast.hosts=elastic02,elastic03
    volumes:
      - /var/docker/elastic:/usr/share/elasticsearch/data
    deploy:
      placement:
        constraints: [node.hostname == node1]
  elastic02:
    <<: *elasticbase
    depends_on:
      - elastic01
    environment:
      - node.name=elastic02
      - cluster.name=elastic
      - network.host=0.0.0.0
      - xpack.security.enabled=false
      - xpack.monitoring.enabled=false
      - xpack.watcher.enabled=false
      - bootstrap.memory_lock=false ## Docker swarm does not support that
      - discovery.zen.minimum_master_nodes=2
      - discovery.zen.ping.unicast.hosts=elastic01,elastic03
    volumes:
      - /var/docker/elastic:/usr/share/elasticsearch/data
    deploy:
      placement:
        constraints: [node.hostname == node2]
  elastic03:
    <<: *elasticbase
    depends_on:
      - elastic01
    volumes:
      - /var/docker/elastic:/usr/share/elasticsearch/data
    environment:
      - node.name=elastic03
      - cluster.name=elastic
      - network.host=0.0.0.0
      - xpack.security.enabled=false
      - bootstrap.memory_lock=false ## Docker swarm does not support that
      - discovery.zen.minimum_master_nodes=2
      - discovery.zen.ping.unicast.hosts=elastic01,elastic02
    deploy:
      placement:
        constraints: [node.hostname == node3]
networks:
  default:
    driver: overlay
    attachable: true
When I deploy the stack, it works like a charm at first: _cluster/health shows that all nodes are up and the status is "green". But after a while, and then periodically, the cluster goes down with the following exception:
Feb 10 09:39:39 : [2018-02-10T08:39:39,159][WARN ][o.e.d.z.UnicastZenPing ] [elastic01] failed to send ping to [{elastic03}{2WS6GPu8Qka9YLE_PWfVKg}{AD_Nw1m9T-CZHUFhgXQjtQ}{10.0.9.5}{10.0.9.5:9300}{ml.max_open_jobs=10, ml.enabled=true}]
Feb 10 09:39:39 : org.elasticsearch.transport.ReceiveTimeoutTransportException: [elastic03][10.0.9.5:9300][internal:discovery/zen/unicast] request_id [5167] timed out after [3750ms]
Feb 10 09:39:39 : at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:961) [elasticsearch-5.6.7.jar:5.6.7]
Feb 10 09:39:39 : at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.6.7.jar:5.6.7]
Feb 10 09:39:39 : at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
Feb 10 09:39:39 : at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
Feb 10 09:39:39 : at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
Feb 10 09:39:40 : [2018-02-10T08:39:40,159][WARN ][o.e.d.z.UnicastZenPing ] [elastic01] failed to send ping to [{elastic03}{2WS6GPu8Qka9YLE_PWfVKg}{AD_Nw1m9T-CZHUFhgXQjtQ}{10.0.9.5}{10.0.9.5:9300}{ml.max_open_jobs=10, ml.enabled=true}]
Feb 10 09:39:40 : org.elasticsearch.transport.ReceiveTimeoutTransportException: [elastic03][10.0.9.5:9300][internal:discovery/zen/unicast] request_id [5172] timed out after [3750ms]
Feb 10 09:39:40 : at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:961) [elasticsearch-5.6.7.jar:5.6.7]
Feb 10 09:39:40 : at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.6.7.jar:5.6.7]
Feb 10 09:39:40 : at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
Feb 10 09:39:40 : at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
Feb 10 09:39:40 : at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
Feb 10 09:39:41 : [2018-02-10T08:39:41,159][WARN ][o.e.d.z.UnicastZenPing ] [elastic01] failed to send ping to [{elastic03}{2WS6GPu8Qka9YLE_PWfVKg}{AD_Nw1m9T-CZHUFhgXQjtQ}{10.0.9.5}{10.0.9.5:9300}{ml.max_open_jobs=10, ml.enabled=true}]
Feb 10 09:39:41 : org.elasticsearch.transport.ReceiveTimeoutTransportException: [elastic03][10.0.9.5:9300][internal:discovery/zen/unicast] request_id [5175] timed out after [3751ms]
Feb 10 09:39:41 : at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:961) [elasticsearch-5.6.7.jar:5.6.7]
Feb 10 09:39:41 : at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.6.7.jar:5.6.7]
Feb 10 09:39:41 : at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
Feb 10 09:39:41 : at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
Feb 10 09:39:41 : at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
And sometimes:
Feb 10 09:44:10 [2018-02-10T08:44:10,810][WARN ][o.e.t.n.Netty4Transport ] [elastic01] exception caught on transport layer [[id: 0x3675891a, L:/10.0.9.210:53316 - R:10.0.9.5/10.0.9.5:9300]], closing connection
Feb 10 09:44:10 java.io.IOException: No route to host
Feb 10 09:44:10 at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:?]
Feb 10 09:44:10 at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:?]
Feb 10 09:44:10 at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:?]
Feb 10 09:44:10 at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[?:?]
Feb 10 09:44:10 at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:?]
Feb 10 09:44:10 at io.netty.buffer.PooledHeapByteBuf.setBytes(PooledHeapByteBuf.java:261) ~[netty-buffer-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1100) ~[netty-buffer-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:372) ~[netty-transport-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:123) [netty-transport-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) [netty-transport-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) [netty-transport-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) [netty-transport-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) [netty-transport-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.13.Final.jar:4.1.13.Final]
Feb 10 09:44:10 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
The strange thing is that every time this happens, I can still ping the remote container and resolve its name from the container that reports the error: no packet loss, no timeouts. Only Elasticsearch's transport layer is affected. All other services running in the same cluster (MongoDB, Redis, internal microservices) have no issues.
Does anybody have a clue?
I found the issue.
Elasticsearch must be bound to a single interface, not to 0.0.0.0. Once I bound it to eth0, it started to work. It also appears that a named volume cannot be used; that throws another error after a while. The data directory must be bind-mounted directly from a local path.
This works:
services:
  elastic01:
    environment:
      - network.host=_eth0_
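For reference, a sketch of what one node's service definition might look like with both fixes applied, assuming the same three-node layout as the stack file above (the `_eth0_` placeholder tells Elasticsearch to bind to the address of the container's eth0 interface):

```yaml
# Sketch only: one node from the stack above, with the fixes applied.
services:
  elastic01:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.7
    environment:
      - node.name=elastic01
      - cluster.name=elastic
      # Bind to the container's eth0 address instead of 0.0.0.0
      - network.host=_eth0_
      - discovery.zen.minimum_master_nodes=2
      - discovery.zen.ping.unicast.hosts=elastic02,elastic03
    volumes:
      # Bind mount from a host path; a named volume caused errors over time
      - /var/docker/elastic:/usr/share/elasticsearch/data
```

The other two nodes would get the same `network.host=_eth0_` override in their own environment blocks.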