Using testcontainers in a Jenkins Docker Agent: containers fail to start, NoRouteToHostException
I'm using a Jenkins declarative pipeline with Docker Agents to build and test my software, including running integration tests using testcontainers. I can run my testcontainers tests OK in my development environment (not using Jenkins), but they fail under Jenkins.
The testcontainers Ryuk resource reaping daemon does not work:
16:29:20.255 [testcontainers-ryuk] WARN o.t.utility.ResourceReaper - Can not connect to Ryuk at 172.17.0.1:32769
java.net.NoRouteToHostException: No route to host (Host unreachable)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.base/java.net.Socket.connect(Socket.java:591)
at java.base/java.net.Socket.connect(Socket.java:540)
at java.base/java.net.Socket.<init>(Socket.java:436)
at java.base/java.net.Socket.<init>(Socket.java:213)
at org.testcontainers.utility.ResourceReaper.lambda$start$1(ResourceReaper.java:112)
at java.base/java.lang.Thread.run(Thread.java:834)
I was able to work around that problem by disabling the daemon, by setting the TESTCONTAINERS_RYUK_DISABLED environment variable to true. But some of the integration tests still repeatedly fail.
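One way to set that variable is in the shell step that runs the tests (a sketch; "mvn verify" is a placeholder for whatever command actually runs the integration tests):

```shell
# Disable the Ryuk reaper for this shell; Testcontainers reads the variable
# from the environment at startup. "mvn verify" stands in for the real command.
export TESTCONTAINERS_RYUK_DISABLED=true
# mvn verify
echo "TESTCONTAINERS_RYUK_DISABLED=$TESTCONTAINERS_RYUK_DISABLED"   # prints TESTCONTAINERS_RYUK_DISABLED=true
```

The same variable can instead be set once for the whole pipeline in the agent's environment, which avoids repeating the export in every test stage.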
An integration test using an ElasticsearchContainer repeatedly fails to start: it times out waiting for the HTTP port to respond.
17:04:57.595 [main] INFO d.e.c.7.1] - Starting container with ID: f5c653442103b9073c76f6ed91fc9117f7cb388d576606be8bd85bd9f3b2051d
17:04:58.465 [main] INFO d.e.c.7.1] - Container docker.elastic.co/elasticsearch/elasticsearch:6.7.1 is starting: f5c653442103b9073c76f6ed91fc9117f7cb388d576606be8bd85bd9f3b2051d
17:04:58.479 [main] INFO o.t.c.wait.strategy.HttpWaitStrategy - /loving_swartz: Waiting for 240 seconds for URL: http://172.17.0.1:32833/
17:08:58.480 [main] ERROR d.e.c.7.1] - Could not start container
org.testcontainers.containers.ContainerLaunchException: Timed out waiting for URL to be accessible (http://172.17.0.1:32833/ should return HTTP 200)
at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.waitUntilReady(HttpWaitStrategy.java:197)
at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:35)
at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:582)
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:259)
at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:212)
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:76)
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:210)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:199)
at
...
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
17:08:58.513 [main] ERROR d.e.c.7.1] - Log output from the failed container:
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=0
[2019-04-11T17:05:02,527][INFO ][o.e.e.NodeEnvironment ] [1a_XhBT] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [1.2tb], net total_space [1.2tb], types [rootfs]
[2019-04-11T17:05:02,532][INFO ][o.e.e.NodeEnvironment ] [1a_XhBT] heap size [989.8mb], compressed ordinary object pointers [true]
[2019-04-11T17:05:02,536][INFO ][o.e.n.Node ] [1a_XhBT] node name derived from node ID [1a_XhBTfQZWw1XLZMXrp4A]; set [node.name] to override
[2019-04-11T17:05:02,536][INFO ][o.e.n.Node ] [1a_XhBT] version[6.7.1], pid[1], build[default/docker/2f32220/2019-04-02T15:59:27.961366Z], OS[Linux/3.10.0-957.10.1.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/12/12+33]
[2019-04-11T17:05:02,536][INFO ][o.e.n.Node ] [1a_XhBT] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-14081126934203442674, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -XX:UseAVX=2, -Des.cgroups.hierarchy.override=/, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=docker]
...
[2019-04-11T17:05:16,338][INFO ][o.e.d.DiscoveryModule ] [1a_XhBT] using discovery type [single-node] and host providers [settings]
[2019-04-11T17:05:17,795][INFO ][o.e.n.Node ] [1a_XhBT] initialized
[2019-04-11T17:05:17,795][INFO ][o.e.n.Node ] [1a_XhBT] starting ...
[2019-04-11T17:05:18,086][INFO ][o.e.t.TransportService ] [1a_XhBT] publish_address {172.28.0.3:9300}, bound_addresses {0.0.0.0:9300}
[2019-04-11T17:05:18,128][WARN ][o.e.b.BootstrapChecks ] [1a_XhBT] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2019-04-11T17:05:18,299][INFO ][o.e.h.n.Netty4HttpServerTransport] [1a_XhBT] publish_address {172.28.0.3:9200}, bound_addresses {0.0.0.0:9200}
[2019-04-11T17:05:18,299][INFO ][o.e.n.Node ] [1a_XhBT] started
[2019-04-11T17:05:18,461][WARN ][o.e.x.s.a.s.m.NativeRoleMappingStore] [1a_XhBT] Failed to clear cache for realms [[]]
[2019-04-11T17:05:18,542][INFO ][o.e.g.GatewayService ] [1a_XhBT] recovered [0] indices into cluster_state
[2019-04-11T17:05:18,822][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.watch-history-9] for index patterns [.watcher-history-9*]
[2019-04-11T17:05:18,871][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.watches] for index patterns [.watches*]
[2019-04-11T17:05:18,906][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.triggered_watches] for index patterns [.triggered_watches*]
[2019-04-11T17:05:18,955][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-logstash] for index patterns [.monitoring-logstash-6-*]
[2019-04-11T17:05:19,017][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-es] for index patterns [.monitoring-es-6-*]
[2019-04-11T17:05:19,054][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-alerts] for index patterns [.monitoring-alerts-6]
[2019-04-11T17:05:19,100][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-beats] for index patterns [.monitoring-beats-6-*]
[2019-04-11T17:05:19,148][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-kibana] for index patterns [.monitoring-kibana-6-*]
[2019-04-11T17:05:19,480][INFO ][o.e.l.LicenseService ] [1a_XhBT] license [17853035-5cf6-49c8-96ca-4d14b26325f6] mode [basic] - valid
Yet the Elasticsearch log file looks OK, and includes the last log message that Elasticsearch writes during start up (about the license).
Manually changing that container to use a HostPortWaitStrategy instead of the default HttpWaitStrategy did not help.
While trying to investigate or work around this problem, I changed my test code to explicitly start the Docker network, by calling network.getId() on the testcontainers Network object. That then failed with a NoRouteToHostException.
How do I fix this?
After some experimentation, I've discovered the cause of the problem. The crucial action is trying to create a Docker bridge network (using docker network create, or a testcontainers Network object) inside a Docker container that is itself running in a Docker bridge network. If you do this you will not get an error message from Docker, nor will the Docker daemon log file include any useful messages. But attempts to use the network will result in there being "no route to host".
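The failure mode can be reproduced outside Jenkins with plain Docker commands (a sketch, shown as a transcript; the docker:cli image, the inner-net network name, and nginx:alpine are stand-ins for illustration):

```shell
# On the host: start a container on the default bridge network, giving it the
# host's Docker socket, as the Jenkins agent configuration does.
$ docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh

# Inside that container: creating a second bridge network reports no error...
$ docker network create inner-net

# ...but a container attached to it cannot be reached from in here; connection
# attempts fail with "no route to host", because traffic between the two
# bridge networks is not routed.
$ docker run -d --network inner-net nginx:alpine
```

Note that because the inner container talks to the host's Docker daemon through the mounted socket, the new network and its containers are actually created on the host, on a different bridge than the one the inner container sits on.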
I fixed the problem by giving my outermost Docker containers (the Jenkins Agents) access to the host network, by having Jenkins provide a --network="host" option to its docker run command:
pipeline {
agent {
dockerfile {
filename 'Dockerfile.jenkinsAgent'
additionalBuildArgs ...
args '-v /var/run/docker.sock:/var/run/docker.sock ... --network="host" -u jenkins:docker'
}
}
stages {
...
That is OK because the Jenkins Agents do not need the level of isolation given by a bridge network.
In my case it was enough to add two arguments to the Docker agent options: a bind mount of the Docker socket (-v /var/run/docker.sock:/var/run/docker.sock), and the --group-add parameter with the ID of the docker group:
pipeline {
agent any
stages {
stage('Gradle build') {
agent {
docker {
reuseNode true
image 'openjdk:11.0-jdk-slim'
args '-v /var/run/docker.sock:/var/run/docker.sock --group-add 992'
}
}
steps {
sh 'env | sort'
sh './gradlew build --no-daemon --stacktrace'
}
}
} // stages
} // pipeline
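The GID passed to --group-add (992 above) is host-specific: it is the numeric ID of the docker group that owns /var/run/docker.sock on the agent's host. On the host it can be looked up with getent group docker; the sketch below parses a made-up /etc/group entry to show how the GID field is extracted:

```shell
# On the real host: gid=$(getent group docker | cut -d: -f3)
# Here a hypothetical /etc/group line stands in for the getent output.
line='docker:x:992:jenkins'
gid=$(echo "$line" | cut -d: -f3)
echo "$gid"   # prints 992
```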