
Jenkins hosted on EC2 loses connection to EC2 workers

My Jenkins instance runs on an EC2 machine (t3.medium) in a private VPC network, served through Nginx, and it loses the connection to its workers during long builds. The workers are the same type: EC2 machines in the same region and subnet, with the same Java version.

Jenkins version: Jenkins 2.319.3
Java: openjdk version "1.8.0_312"
OS: Ubuntu 20.02

The workers are connected to Jenkins over SSH.

What I tried to resolve this issue:

  1. I changed the EC2 instance type because I suspected the machine did not have enough memory. The issue still exists.

  2. I upgraded Java to Java 11. No effect.

  3. I changed the SSHD configuration on the agent/worker (added ClientAliveInterval 80).

  4. I increased Connection Timeout in Seconds in the worker configuration (60 -> 6000).

  5. I used the option to connect the worker to the Jenkins master by command. The connection was still lost.

  6. I configured more aggressive TCP keepalive parameters:

     sysctl -w net.ipv4.tcp_keepalive_time=120
     sysctl -w net.ipv4.tcp_keepalive_intvl=30
     sysctl -w net.ipv4.tcp_keepalive_probes=8
     sysctl -w net.ipv4.tcp_fin_timeout=30

  7. I added hudson.slaves.ChannelPinger.pingIntervalSeconds=-1 to the Java options.
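
For reference, here is a rough sketch of what the keepalive values from step 6 work out to. It assumes the usual Linux meaning of these sysctls: tcp_keepalive_time is the idle time before the first probe, tcp_keepalive_intvl is the gap between probes, and tcp_keepalive_probes is the number of unanswered probes before the kernel drops the connection. The function name keepalive_dead_after is made up for illustration and is not part of Jenkins or any other tool.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): worst-case number of seconds before
# the kernel gives up on an idle, unresponsive TCP connection.
#   $1 = tcp_keepalive_time   (idle seconds before the first probe)
#   $2 = tcp_keepalive_intvl  (seconds between probes)
#   $3 = tcp_keepalive_probes (unanswered probes before the drop)
keepalive_dead_after() {
    echo $(( $1 + $2 * $3 ))
}

# Values from step 6: first probe after 120 s idle, then 8 probes 30 s apart.
keepalive_dead_after 120 30 8   # prints 360
```

So with these settings a silent peer should be detected after about 6 minutes, and an idle connection should see traffic every 2 minutes. These sysctls only apply to sockets that enable SO_KEEPALIVE, and values set with sysctl -w do not survive a reboot. The live values can be read back with, for example, sysctl net.ipv4.tcp_keepalive_time.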

Any ideas what can be wrong here?

Error:

04:01:35 FATAL: command execution failed
04:01:36 java.io.EOFException
04:01:36    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2799)
04:01:36    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3274)
04:01:36    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:934)
04:01:36    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:396)
04:01:36    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
04:01:36    at hudson.remoting.Command.readFrom(Command.java:142)
04:01:36    at hudson.remoting.Command.readFrom(Command.java:128)
04:01:36    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
04:01:36    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:61)
04:01:36 Caused: java.io.IOException: Unexpected termination of the channel
04:01:36    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:75)

References:

Nginx conf:

upstream jenkins {
  server 127.0.0.1:8080;
}

server {

    listen 443 ssl;
    server_name XXX.CCC.net;

    ssl_certificate           /etc/nginx/valid_cert/XXX.pem;
    ssl_certificate_key       /etc/nginx/valid_cert/XXX.CCC.net.key;
 
    ssl_protocols  TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_prefer_server_ciphers on;

    access_log            /var/log/nginx/jenkins.access.log;

    ssl_session_cache shared:SSL:10m;
    ssl_stapling on;
    ssl_stapling_verify on;

    location / {
      try_files $uri @app;
    }


    location @app {
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;
      proxy_next_upstream error;
      proxy_pass http://jenkins;
      proxy_redirect http:// https://;
      proxy_read_timeout 150;
    }
}
