简体   繁体   中英

Using testcontainers in a Jenkins Docker Agent: containers fail to start, NoRouteToHostException

I'm using a Jenkins declarative pipeline with Docker Agents to build and test my software, including running integration tests using testcontainers. I can run my testcontainers tests OK in my development environment (not using Jenkins), but they fail under Jenkins.

The testcontainers Ryuk resource reaping daemon does not work

16:29:20.255 [testcontainers-ryuk] WARN  o.t.utility.ResourceReaper - Can not connect to Ryuk at 172.17.0.1:32769
java.net.NoRouteToHostException: No route to host (Host unreachable)
    at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
    at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
    at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
    at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
    at java.base/java.net.Socket.connect(Socket.java:591)
    at java.base/java.net.Socket.connect(Socket.java:540)
    at java.base/java.net.Socket.<init>(Socket.java:436)
    at java.base/java.net.Socket.<init>(Socket.java:213)
    at org.testcontainers.utility.ResourceReaper.lambda$start$1(ResourceReaper.java:112)
    at java.base/java.lang.Thread.run(Thread.java:834)

I was able to work around that problem by disabling the daemon by setting the TESTCONTAINERS_RYUK_DISABLED environment variable to true . But some of the integration tests still repeatedly fail, however.

An integration test using an ElasticsearchContainer repeatedly fails to start: it times out waiting for the HTTP port to respond.

17:04:57.595 [main] INFO  d.e.c.7.1] - Starting container with ID: f5c653442103b9073c76f6ed91fc9117f7cb388d576606be8bd85bd9f3b2051d
17:04:58.465 [main] INFO  d.e.c.7.1] - Container docker.elastic.co/elasticsearch/elasticsearch:6.7.1 is starting: f5c653442103b9073c76f6ed91fc9117f7cb388d576606be8bd85bd9f3b2051d
17:04:58.479 [main] INFO  o.t.c.wait.strategy.HttpWaitStrategy - /loving_swartz: Waiting for 240 seconds for URL: http://172.17.0.1:32833/
17:08:58.480 [main] ERROR d.e.c.7.1] - Could not start container
org.testcontainers.containers.ContainerLaunchException: Timed out waiting for URL to be accessible (http://172.17.0.1:32833/ should return HTTP 200)
    at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.waitUntilReady(HttpWaitStrategy.java:197)
    at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:35)
    at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:582)
    at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:259)
    at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:212)
    at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:76)
    at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:210)
    at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:199)
    at
...
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
17:08:58.513 [main] ERROR d.e.c.7.1] - Log output from the failed container:
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.

OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=0

[2019-04-11T17:05:02,527][INFO ][o.e.e.NodeEnvironment    ] [1a_XhBT] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [1.2tb], net total_space [1.2tb], types [rootfs]

[2019-04-11T17:05:02,532][INFO ][o.e.e.NodeEnvironment    ] [1a_XhBT] heap size [989.8mb], compressed ordinary object pointers [true]

[2019-04-11T17:05:02,536][INFO ][o.e.n.Node               ] [1a_XhBT] node name derived from node ID [1a_XhBTfQZWw1XLZMXrp4A]; set [node.name] to override

[2019-04-11T17:05:02,536][INFO ][o.e.n.Node               ] [1a_XhBT] version[6.7.1], pid[1], build[default/docker/2f32220/2019-04-02T15:59:27.961366Z], OS[Linux/3.10.0-957.10.1.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/12/12+33]

[2019-04-11T17:05:02,536][INFO ][o.e.n.Node               ] [1a_XhBT] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-14081126934203442674, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -XX:UseAVX=2, -Des.cgroups.hierarchy.override=/, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=docker]

...

[2019-04-11T17:05:16,338][INFO ][o.e.d.DiscoveryModule    ] [1a_XhBT] using discovery type [single-node] and host providers [settings]

[2019-04-11T17:05:17,795][INFO ][o.e.n.Node               ] [1a_XhBT] initialized

[2019-04-11T17:05:17,795][INFO ][o.e.n.Node               ] [1a_XhBT] starting ...

[2019-04-11T17:05:18,086][INFO ][o.e.t.TransportService   ] [1a_XhBT] publish_address {172.28.0.3:9300}, bound_addresses {0.0.0.0:9300}

[2019-04-11T17:05:18,128][WARN ][o.e.b.BootstrapChecks    ] [1a_XhBT] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

[2019-04-11T17:05:18,299][INFO ][o.e.h.n.Netty4HttpServerTransport] [1a_XhBT] publish_address {172.28.0.3:9200}, bound_addresses {0.0.0.0:9200}

[2019-04-11T17:05:18,299][INFO ][o.e.n.Node               ] [1a_XhBT] started

[2019-04-11T17:05:18,461][WARN ][o.e.x.s.a.s.m.NativeRoleMappingStore] [1a_XhBT] Failed to clear cache for realms [[]]

[2019-04-11T17:05:18,542][INFO ][o.e.g.GatewayService     ] [1a_XhBT] recovered [0] indices into cluster_state

[2019-04-11T17:05:18,822][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.watch-history-9] for index patterns [.watcher-history-9*]

[2019-04-11T17:05:18,871][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.watches] for index patterns [.watches*]

[2019-04-11T17:05:18,906][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.triggered_watches] for index patterns [.triggered_watches*]

[2019-04-11T17:05:18,955][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-logstash] for index patterns [.monitoring-logstash-6-*]

[2019-04-11T17:05:19,017][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-es] for index patterns [.monitoring-es-6-*]

[2019-04-11T17:05:19,054][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-alerts] for index patterns [.monitoring-alerts-6]

[2019-04-11T17:05:19,100][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-beats] for index patterns [.monitoring-beats-6-*]

[2019-04-11T17:05:19,148][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-kibana] for index patterns [.monitoring-kibana-6-*]

[2019-04-11T17:05:19,480][INFO ][o.e.l.LicenseService     ] [1a_XhBT] license [17853035-5cf6-49c8-96ca-4d14b26325f6] mode [basic] - valid

Yet the Elasticsearch log file looks OK, and includes the last log message that Elasticsearch writes during start up (about the license).

Manually changing that container to use a HostPortWaitStrategy instead of the default HttpWaitStrategy did not help.

While trying to investigate or work around this problem, I changed my test code to explicitly start the Docker network, by calling network.getId() for the testcontainers Network object. That then failed with a NoRouteToHostException .

How do I fix this?

After some experimentation, I've discovered the cause of the problem. The crucial action is trying to create a Docker bridge network (using docker network create , or a testcontainers Network object) inside a Docker container that is itself running in a Docker bridge network. If you do this you will not get an error message from Docker, nor will the Docker daemon log file include any useful messages. But attempts to use the network will result in there being "no route to host".

I fixed the problem by giving my outermost Docker containers (the Jenkins Agents) access to the host network, by having Jenkins provide a --network="host" option to its docker run command:

pipeline {
    agent {
        dockerfile {
            filename 'Dockerfile.jenkinsAgent'
            additionalBuildArgs  ...
            args '-v /var/run/docker.sock:/var/run/docker.sock ... --network="host" -u jenkins:docker'
       }
    }
    stages {
...

That is OK because the Jenkins Agents do not need the level of isolation given by a bridge network.

In my case it was enough to add two arguments to Docker agent options:

  • Docker socket as volume
  • add --group-add parameter with ID of docker group
pipeline {
    agent any
    stages {
        stage('Gradle build') {
            agent {
                docker {
                    reuseNode true
                    image 'openjdk:11.0-jdk-slim'
                    args  '-v /var/run/docker.sock:/var/run/docker.sock --group-add 992'
                }
            }

            steps {
                sh 'env | sort'
                sh './gradlew build --no-daemon --stacktrace'
            }
        }
    } // stages
} // pipeline

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM