My Jenkins controller runs on an EC2 instance (t3.medium) in a private VPC subnet, behind Nginx, and it loses the connection to its agents during long builds. The workers are EC2 instances of the same type, in the same region and subnet, running the same Java version.
Jenkins version: 2.319.3
Java: openjdk version "1.8.0_312"
OS: Ubuntu 20.04
Agents are connected over SSH.
What I have tried to resolve this issue:
I changed the EC2 instance type. The machine was short on memory, but even after resizing the issue persists.
I upgraded Java to Java 11, with no effect.
I changed the agent's SSHD configuration (added ClientAliveInterval 80).
I increased "Connection Timeout in Seconds" in the agent configuration (60 -> 6000).
I tried launching the agent by having it connect to the controller via a command; the connection still dropped.
I configured more aggressive TCP keepalive parameters:
sysctl -w net.ipv4.tcp_keepalive_time=120
sysctl -w net.ipv4.tcp_keepalive_intvl=30
sysctl -w net.ipv4.tcp_keepalive_probes=8
sysctl -w net.ipv4.tcp_fin_timeout=30
I added hudson.slaves.ChannelPinger.pingIntervalSeconds=-1 to the Java options.
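For reference, the keepalive and SSHD changes above can be captured as file fragments like this (the drop-in filename is my own choice; the values are the ones listed, and `sysctl -w` settings are lost on reboot, so a file under /etc/sysctl.d makes them persistent):

```conf
# /etc/sysctl.d/99-tcp-keepalive.conf -- persistent version of the
# keepalive tuning tried above; reload with: sudo sysctl --system
net.ipv4.tcp_keepalive_time = 120
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 8
net.ipv4.tcp_fin_timeout = 30

# /etc/ssh/sshd_config on the agent -- the ClientAliveInterval change;
# ClientAliveCountMax (default 3) bounds how many missed keepalive
# replies sshd tolerates before closing the session
ClientAliveInterval 80
```

On a Debian/Ubuntu package install, the -Dhudson.slaves.ChannelPinger.pingIntervalSeconds=-1 property typically goes into JAVA_ARGS in /etc/default/jenkins; restart Jenkins and sshd after these changes.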
Any ideas what can be wrong here?
Error:
04:01:35 FATAL: command execution failed
04:01:36 java.io.EOFException
04:01:36 at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2799)
04:01:36 at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3274)
04:01:36 at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:934)
04:01:36 at java.io.ObjectInputStream.<init>(ObjectInputStream.java:396)
04:01:36 at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
04:01:36 at hudson.remoting.Command.readFrom(Command.java:142)
04:01:36 at hudson.remoting.Command.readFrom(Command.java:128)
04:01:36 at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
04:01:36 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:61)
04:01:36 Caused: java.io.IOException: Unexpected termination of the channel
04:01:36 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:75)
References:
Nginx conf:
upstream jenkins {
    server 127.0.0.1:8080;
}

server {
    listen 443 ssl;
    server_name XXX.CCC.net;

    ssl_certificate     /etc/nginx/valid_cert/XXX.pem;
    ssl_certificate_key /etc/nginx/valid_cert/XXX.CCC.net.key;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_prefer_server_ciphers on;

    access_log /var/log/nginx/jenkins.access.log;

    ssl_session_cache shared:SSL:10m;
    ssl_stapling on;
    ssl_stapling_verify on;

    location / {
        try_files $uri @app;
    }

    location @app {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_next_upstream error;
        proxy_pass http://jenkins;
        proxy_redirect http:// https://;
        proxy_read_timeout 150;
    }
}
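One note on this config: an SSH-launched agent is a direct controller-to-agent connection on port 22, so it should not pass through Nginx at all; these timeouts would only matter for WebSocket agents or long-running UI/API requests. If anything long-lived is proxied, though, the 150-second proxy_read_timeout is short. An illustrative, untested variant of the @app block (the 3600-second values are assumptions, not recommendations):

```nginx
location @app {
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    # allow WebSocket upgrades (needed if agents connect over WebSocket)
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_next_upstream error;
    proxy_pass http://jenkins;
    proxy_redirect http:// https://;
    # longer timeouts so idle long-running requests are not cut off
    proxy_read_timeout 3600;
    proxy_send_timeout 3600;
}
```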