
Too many TIME_WAIT connections with Jetty

I am running an API on 10 different servers, all of them behind a firewall. I am using Jetty 8 to serve all the HTTP requests. The use case for this API is short-lived connections.

A few months ago I started getting random 'Too many open file descriptors' errors. These errors make the server completely unresponsive, and I need to restart the Jetty server to recover. Currently this happens 0 to 10 times a day, depending on the traffic I am getting.

After some investigation, I noticed that I am exhausting the number of available connections because all of them are stuck in the TIME_WAIT state, so I can't create new ones.

ss -s

TCP:   13392 (estab 1549, closed 11439, orphaned 9, synrecv 0, timewait *11438*/0), ports 932

In this example the number of connections in the TIME_WAIT state is fairly low, but it can go up to 50k.

I have tried several kernel tweaks, and I also tried setting the SO_LINGER timer to 1 second for the Jetty sockets. All these changes helped reduce the frequency, but I am still getting errors regularly.

Also worth mentioning: I am receiving around 3k requests/second on each server and the CPU usage is very low. The bottleneck to scaling my traffic today is this connection issue.

Does anyone have an idea of what I can do to handle this correctly?

'Too many open file descriptors' is probably caused by a resource leak in your application.
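As a rough illustration of what to look for, here is a minimal sketch of code that releases its descriptor on every path, including the failure path. The class and method names are hypothetical and not taken from the question's code; the loopback ServerSocket only stands in for whatever resource the real handler opens.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class LeakFreeHandlerSketch {
    // Opens a connection, fails midway, and reports whether the socket
    // was still closed. try-with-resources closes it before the exception
    // reaches the catch block, so no descriptor is left behind.
    static boolean closedAfterFailure() {
        Socket[] seen = new Socket[1];
        try (ServerSocket backend = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            try (Socket socket = new Socket(backend.getInetAddress(), backend.getLocalPort())) {
                seen[0] = socket;
                throw new IOException("simulated failure while handling the request");
            }
        } catch (IOException expected) {
            // The failure still propagates to here, but the socket is already closed.
        }
        return seen[0] != null && seen[0].isClosed();
    }

    public static void main(String[] args) {
        System.out.println("closed after failure: " + closedAfterFailure());
    }
}
```

A leaky version would create the Socket outside any try block and call close() only at the end of the happy path, which is the pattern worth searching for in streams, sockets, and file handles.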

The TIME_WAIT state occurs at the end that sends the close first, not at the end that receives it first. You might want to reconsider your application protocol so that it is the client that closes first. This is not too hard to arrange. It falls out for free if you use client-side connection pooling, for example.
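A minimal sketch of that arrangement over plain sockets, assuming a toy one-byte request/response protocol (not HTTP, and not Jetty's API): the server replies and then waits for the client's end-of-stream before closing its own socket, so the client is the end that closes first and the TIME_WAIT lands there. All names here are hypothetical.

```java
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ClientClosesFirstSketch {
    // Returns true when the server observed the client's close (EOF)
    // before closing its own end of the connection.
    static boolean serverSeesClientCloseFirst() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Future<Boolean> sawEof = pool.submit(() -> {
                try (Socket s = server.accept()) {
                    s.setSoTimeout(5000);               // do not wait forever for the client
                    InputStream in = s.getInputStream();
                    int request = in.read();            // toy one-byte request
                    s.getOutputStream().write(request); // echo it back as the response
                    s.getOutputStream().flush();
                    return in.read() == -1;             // block until the client closes
                }                                       // only now does the server close
            });
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.getOutputStream().write('x');
                client.getOutputStream().flush();
                client.getInputStream().read();         // consume the response
            }                                           // client closes first
            return sawEof.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("client closed first: " + serverSeesClientCloseFirst());
    }
}
```

The read timeout matters in a real server: without it, a client that never closes would pin the connection and its descriptor indefinitely.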

These two conditions are not related. A connection can only be in the TIME_WAIT state after its socket has been closed, at which point its file descriptor has already been released. TIME_WAIT therefore does not cause 'too many open file descriptors' problems.
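One way to check this from inside the JVM, sketched under the assumption of a Unix-like system and a JDK that exposes com.sun.management.UnixOperatingSystemMXBean (a JDK-specific interface, not part of the standard java.* API): open and fully close a batch of loopback connections, then compare the process's open descriptor count before and after. The class and method names are hypothetical.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class TimeWaitFdSketch {
    // Descriptors currently open in this JVM, or -1 if the platform bean is unavailable.
    static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    // Opens and closes `connections` loopback connections and returns how many
    // descriptors the process gained. The closed connections may linger in
    // TIME_WAIT in the kernel, but they no longer hold a descriptor.
    static long fdGrowthAfterClosing(int connections) {
        long before = openFds();
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            for (int i = 0; i < connections; i++) {
                try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                     Socket accepted = server.accept()) {
                    client.getOutputStream().write(1);
                    accepted.getInputStream().read();
                }
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        long after = openFds();
        return (before < 0 || after < 0) ? 0 : after - before;
    }

    public static void main(String[] args) {
        System.out.println("descriptor growth: " + fdGrowthAfterClosing(100));
    }
}
```

If the growth stays near zero while the TIME_WAIT count climbs, the descriptor exhaustion has to come from sockets or files that are still open, which points back at a leak.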
