Jenkins slave pod is offline

I am having an issue with slave pods not being able to connect to the Jenkins master.

This is the Jenkins build output:

[Pipeline] Start of Pipeline
[Pipeline] podTemplate
[Pipeline] {
[Pipeline] node
Still waiting to schedule task
‘ci-xprj2-2z8qp’ is offline

I can see this in the Jenkins pod log:

2020-09-24 20:16:57.778+0000 [id=6228]  INFO    o.c.j.p.k.KubernetesLauncher#launch: Created Pod: infrastructure/ci-xprj2-2tqzn
2020-09-24 20:16:57.778+0000 [id=24]    INFO    hudson.slaves.NodeProvisioner#lambda$update$6: Kubernetes Pod Template provisioning successfully completed. We have now 2 computer(s)
2020-09-24 20:16:57.779+0000 [id=24]    INFO    o.c.j.p.k.KubernetesCloud#provision: Excess workload after pending Kubernetes agents: 0
2020-09-24 20:16:57.779+0000 [id=24]    INFO    o.c.j.p.k.KubernetesCloud#provision: Template for label ci: Kubernetes Pod Template
2020-09-24 20:16:57.839+0000 [id=5801]  INFO    o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-09-24 20:16:59.902+0000 [id=6228]  INFO    o.c.j.p.k.KubernetesLauncher#launch: Pod is running: infrastructure/ci-xprj2-2tqzn
2020-09-24 20:16:59.906+0000 [id=6228]  INFO    o.c.j.p.k.KubernetesLauncher#launch: Waiting for agent to connect (0/100): ci-xprj2-2tqzn
2020-09-24 20:17:00.911+0000 [id=6228]  INFO    o.c.j.p.k.KubernetesLauncher#launch: Waiting for agent to connect (1/100): ci-xprj2-2tqzn
2020-09-24 20:17:01.917+0000 [id=6228]  INFO    o.c.j.p.k.KubernetesLauncher#launch: Waiting for agent to connect (2/100): ci-xprj2-2tqzn

The log from ci-xprj2-2tqzn shows this:

Sep 24, 2020 8:18:59 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: ci-xprj2-29g0p
Sep 24, 2020 8:18:59 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Sep 24, 2020 8:18:59 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 4.3
Sep 24, 2020 8:18:59 PM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
Sep 24, 2020 8:18:59 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins1:8080/]
Sep 24, 2020 8:19:19 PM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: Failed to connect to http://jenkins1:8080/tcpSlaveAgentListener/: jenkins1
java.io.IOException: Failed to connect to http://jenkins1:8080/tcpSlaveAgentListener/: jenkins1
    at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:217)
    at hudson.remoting.Engine.innerRun(Engine.java:693)
    at hudson.remoting.Engine.run(Engine.java:518)
Caused by: java.net.UnknownHostException: jenkins1
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:607)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
    at sun.net.www.http.HttpClient.New(HttpClient.java:339)
    at sun.net.www.http.HttpClient.New(HttpClient.java:357)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1226)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
    at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:214)
    ... 2 more

My Jenkins config looks like this: [two screenshots, not included]

Any help?

It looks like the error to focus on is:

SEVERE: Failed to connect to http://jenkins1:8080/tcpSlaveAgentListener/: jenkins1
java.io.IOException: Failed to connect to http://jenkins1:8080/tcpSlaveAgentListener/: jenkins1
...
Caused by: java.net.UnknownHostException

which means the hostname jenkins1 can't be resolved from inside the agent pod.

  • If jenkins1 corresponds to a Kubernetes service name, I would double-check its name and details, then spin up another pod in your namespace that sleeps for a while so that you can exec into it and see whether jenkins1 resolves from there.
kubectl exec -it <sleep-test-pod-name> -- /bin/bash

ping jenkins1
nslookup jenkins1  # install nslookup if it is not already installed
  • If jenkins1 corresponds to one of those single-word domains you sometimes see at corporations, then I would double-check the search domains in /etc/resolv.conf in your pods:
cat /etc/resolv.conf
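
The checks above can be sketched as a small script to run inside the test pod. It is only a sketch under assumptions: that jenkins1 is a Kubernetes service, that the namespace you pass in is the one the service lives in (the `infrastructure` value in the usage comment is a guess taken from the pod log, not something the question confirms), that the cluster uses the default `cluster.local` domain, and that `getent` is available in the image (it is on glibc-based images such as Debian; otherwise substitute nslookup). The function names and the `dns-test` pod name are made up for illustration.

```shell
#!/bin/sh
# Sketch: list the DNS names a pod could use for a Kubernetes service and
# report which of them resolve. All names here are illustrative assumptions.

candidate_names() {
  # $1 = service name, $2 = namespace, $3 = cluster domain (default: cluster.local)
  svc=$1
  ns=$2
  domain=${3:-cluster.local}
  # Short name (works only from the same namespace), namespace-qualified
  # name, and the fully qualified service name.
  printf '%s\n' "$svc" "$svc.$ns" "$svc.$ns.svc.$domain"
}

check_names() {
  # Takes the same arguments as candidate_names and tries each name with getent.
  candidate_names "$@" | while read -r name; do
    if getent hosts "$name" >/dev/null 2>&1; then
      echo "resolves: $name"
    else
      echo "UNRESOLVED: $name"
    fi
  done
}

# Possible usage (pod name, image, and namespace are assumptions):
#   kubectl run dns-test --image=debian:stable-slim --restart=Never -n infrastructure -- sleep 3600
#   kubectl exec -it dns-test -n infrastructure -- /bin/bash
#   # then, inside the pod, after defining the functions above:
#   check_names jenkins1 infrastructure
```

If only the longer names resolve, the Jenkins service is probably in a different namespace from the agent pods, and the Jenkins URL in the Kubernetes cloud configuration would need the namespace-qualified form rather than the bare jenkins1.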
