
Apache HTTP Client: build simulator using multithreaded environment

Original post: 2014-12-05 22:08:08. Tags: java, multithreading, apache-httpclient-4.x, java.util.concurrent

I am building a standalone Java application to generate load on a system, simulating real-world conditions.

The application is multithreaded, using the concurrent framework to generate lots of pooled threads, each of which runs "sessions". When the session is complete, the runnable ends and the thread is returned to the scheduler pool. Each "session" consists of the following:

generate HTTP PUT #1 to server
wait x seconds (randomized within logical limit)
generate HTTP PUT #2 to server
generate HTTP PUT #3 to server
wait y seconds (randomized within logical limit)
generate HTTP PUT #4 to server
generate HTTP PUT #5 to server

Each session takes approximately 2 minutes altogether.
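For readers who want to picture the session structure, here is a minimal sketch using only the JDK. It is not the asker's code: the `PutSender` interface, the class name, and the pause bounds are hypothetical stand-ins, and the actual HTTP call is left abstract so the sketch runs without any HTTP library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

public class SessionSketch {

    /** Hypothetical stand-in for whatever actually issues the HTTP PUT. */
    public interface PutSender {
        void put(int requestNumber) throws Exception;
    }

    /** One simulated session: five PUTs with two randomized pauses. */
    public static Runnable session(PutSender sender, long minPauseMs, long maxPauseMs) {
        return () -> {
            try {
                sender.put(1);
                pause(minPauseMs, maxPauseMs);   // "wait x"
                sender.put(2);
                sender.put(3);
                pause(minPauseMs, maxPauseMs);   // "wait y"
                sender.put(4);
                sender.put(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (Exception e) {
                // A real simulator would record the failure somewhere useful.
                System.err.println("session failed: " + e);
            }
        };
    }

    private static void pause(long minMs, long maxMs) throws InterruptedException {
        // The upper bound of nextLong is exclusive, hence the + 1.
        Thread.sleep(ThreadLocalRandom.current().nextLong(minMs, maxMs + 1));
    }

    /** Runs sessionCount sessions on a fixed pool; returns the PUT numbers recorded. */
    public static List<Integer> runDemo(int threads, int sessionCount)
            throws InterruptedException {
        List<Integer> sent = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < sessionCount; i++) {
            // The fake sender just records the request number.
            pool.submit(session(n -> sent.add(n), 5, 20));
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return sent;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(4, 8).size() + " PUTs recorded");
    }
}
```

The pauses here are milliseconds so the demo finishes quickly; the sessions described above would use pauses on the order of seconds.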

Each thread created by the concurrent pool maintains a connection (an HttpClient) which is reused by subsequent sessions. After each request a call to HttpRequest.releaseConnection() is made, as is recommended.

It all works pretty well. But perhaps TOO well.

While keeping the connections open and releasing them gives optimum performance, since I am building a simulator I really DON'T want optimal performance. I want to simulate suboptimal performance. I want the server to have to go through connection establishment on each session.
I want to create the connection (embedded in an HTTP client) at the beginning of each session and close it at the end of the session.

To accomplish this, I simply close the HttpClient at the end of the session and set its variable to null (both in a finally clause). When an application session thread starts a new session, if the client is null, it builds a new one using the HttpClientBuilder. However, when I do that, I get all sorts of connection pool errors that foul up the simulation.

Is there a RIGHT way to make connections suboptimally, as described above, with Apache HttpClient? Kind of a crazy question, but a real one.

2 answers

I ran into this problem too, just a few months ago. When you call HttpRequest.releaseConnection() the connection is pooled and doesn't get closed immediately. When you started creating new HttpClients you also stopped making use of these connection pools and started creating new TCP connections every time you needed a request, and that's what caused the problem. If you have multiple threads creating and releasing connections in a short period of time, then Java (and even the underlying OS layer) starts throwing errors, which, based on your comments, is what I assume is happening to you.

This is a short list of things that I tried at the time and that seemed to reduce the rate of occurrence of this sort of error:

Set the TCP connection timeout values to low values (e.g. 5 secs). I was running my code using Java 1.7.0 on OSX and this made things better. Later on I stopped using HttpClient and switched to JerseyClient, and having low timeouts made a positive difference there too.
If you're on OSX/Linux, use the ulimit command to bump up the maximum number of file descriptors. That doesn't make connections close any faster, but it seems to let you keep more connections open and therefore delays the problem.
Instead of having a single process with many threads (e.g. 1 process with 500 threads), try spawning multiple child processes with fewer threads (e.g. 5 processes with 100 threads each). I noticed that killing subprocesses helps with closing connections.
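As an illustration of the first item, low timeouts can be set in HttpClient 4.3+ through a `RequestConfig`. This is a sketch, not the answerer's code: the class name `LowTimeoutClientFactory` is made up, and applying the same value to the socket and connection-request timeouts (not only the connect timeout) is an assumption. It needs the HttpClient 4.x jars on the classpath.

```java
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class LowTimeoutClientFactory {

    /** Builds a client whose timeouts are all set to timeoutMillis (e.g. 5000). */
    public static CloseableHttpClient build(int timeoutMillis) {
        RequestConfig config = RequestConfig.custom()
                .setConnectTimeout(timeoutMillis)            // TCP connect
                .setSocketTimeout(timeoutMillis)             // wait for data (assumption)
                .setConnectionRequestTimeout(timeoutMillis)  // wait for a pooled connection (assumption)
                .build();
        return HttpClients.custom()
                .setDefaultRequestConfig(config)
                .build();
    }
}
```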

At the time I also ended up having to distribute the test agents over multiple hosts and use Grinder to synchronize them. Depending on how many concurrent connections you're trying to test your server with, this might be the best way forward.

You can toggle re-use of connections within the pool using the builder's setConnectionReuseStrategy. The DefaultConnectionReuseStrategy will re-use connections whenever possible, but the NoConnectionReuseStrategy will close all connections returned to the pool.
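A minimal sketch of that setting, assuming HttpClient 4.3+ on the classpath. The class name `NoReuseClientFactory` and the choice to apply the same limit per route as in total are illustrative assumptions, not something stated in the answer.

```java
import org.apache.http.impl.NoConnectionReuseStrategy;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class NoReuseClientFactory {

    /** Builds a pooled client that closes every connection once it is released. */
    public static CloseableHttpClient build(int maxConnections) {
        PoolingHttpClientConnectionManager pool = new PoolingHttpClientConnectionManager();
        pool.setMaxTotal(maxConnections);            // cap on open connections overall
        pool.setDefaultMaxPerRoute(maxConnections);  // same cap per target host (assumption)

        return HttpClientBuilder.create()
                .setConnectionManager(pool)
                // Each request then goes through connection establishment again.
                .setConnectionReuseStrategy(NoConnectionReuseStrategy.INSTANCE)
                .build();
    }
}
```

With this approach a single client can be shared by all session threads, so there should be no need to close and rebuild the HttpClient per session just to force new connections.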

I used these connection re-use strategies in reverse: in production no re-use was set (to ensure proper load-balancing, since every new connection is directed to a healthy server), but during testing I had to switch back to the default re-use strategy, because the test was creating so many connections that the test machine quickly ran out of ports to use (after a local port is used, the OS keeps the port in a waiting/cooldown state for a while, the TIME_WAIT state that is part of the TCP protocol). The good thing is that the test code only differs from the production code in this one setting of the connection re-use strategy.

Note that the combination of a connection pool and no re-use of connections still has its purpose: the pool will prevent more open connections than its maximum allowed size. E.g. if the application decides it wants to open 100 connections at the same time, and the pool has a maximum size of 30, the pool will make the requests for the other 70 connections wait until connections are returned. This is a good way to make clients behave nicely and prevent them from overloading the server.
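The waiting behaviour described above can be mimicked with a plain `Semaphore`. This is not how HttpClient's pool is implemented, only a JDK-only sketch of the same idea. The class and method names are hypothetical, and a short sleep stands in for the request.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedPoolSketch {

    /**
     * Starts `requests` tasks at once but lets at most `maxOpen` of them hold
     * a "connection" (a semaphore permit) at any moment. Returns the highest
     * number of simultaneously held permits that was observed.
     */
    public static int peakConcurrency(int requests, int maxOpen)
            throws InterruptedException {
        Semaphore permits = new Semaphore(maxOpen);
        AtomicInteger open = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(requests);

        for (int i = 0; i < requests; i++) {
            workers.submit(() -> {
                try {
                    permits.acquire();                 // block until a slot is free
                    try {
                        int now = open.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(10);              // pretend to perform a request
                    } finally {
                        open.decrementAndGet();
                        permits.release();             // hand the slot back
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        workers.shutdown();
        workers.awaitTermination(30, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 100 simultaneous requests against a limit of 30, as in the example above.
        System.out.println("peak open connections: " + peakConcurrency(100, 30));
    }
}
```

Whatever the scheduling, the reported peak cannot exceed the limit, because a task only counts itself as open while it holds a permit.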