HttpClient, ConnectionManager and weird pool limits

My web app uses AWS S3 to store a bunch of files. When a request comes in, the appropriate file is fetched from S3 with the Jets3t library, which uses ASF HttpClient under the hood. The problem is that the HttpClient connection manager uses a weird pool concept that forces you to set a maximum number of connections per route and a maximum number of connections overall. I'm used to pools that adjust their size to incoming demand, but HttpClient doesn't work that way. So when the pool limit is reached, requests are put on hold until a connection is free. This silently degrades the performance of my web app (the S3 service is far from saturation).

I can't control the number of requests coming to my web app in any way, so any effort to come up with a sane max connection limit is futile. Even if a certain value works for the current load, it will fail when the load changes rapidly (e.g. the website being crawled by a search engine).

So here are my questions:

  1. Is there any Connection Manager for HttpClient (possibly third-party) that doesn't enforce such limits?

  2. If there isn't, can I somehow make the Connection Manager report starvation of the connection pool? If all else fails, I'd like to raise the max connection limit whenever I see such a message in the logs.

In case anybody wants to suggest it: I've tried having a separate instance of the library (and thus of HttpClient) per request thread. It works quite nicely, although I guess it consumes more resources. I might use that approach if all efforts to overcome the max connection limit fail.

org.apache.http.impl.conn.PoolingHttpClientConnectionManager implements the org.apache.http.pool.ConnPoolControl interface, which provides these methods for getting the current connection pool statistics:

  • PoolStats getTotalStats()
  • PoolStats getStats(final T route)

You can construct the PoolingHttpClientConnectionManager instance yourself and pass it to the org.apache.http.impl.client.HttpClientBuilder used to build the HttpClient, so you keep a reference to that instance and can query the statistics any time you want.
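As a rough illustration of what to do with those statistics, here is a minimal sketch of a starvation check. PoolSnapshot and isStarved() are hypothetical names made up for this example, not part of HttpClient; the four fields mirror the values PoolStats exposes through getLeased(), getPending(), getAvailable() and getMax(). The rule "starved when callers are waiting and the limit is fully leased" is an assumption you may want to tune.

```java
// Sketch only: PoolSnapshot is a hypothetical holder, not an HttpClient class.
final class PoolSnapshot {
    final int leased;     // connections currently handed out to callers
    final int pending;    // callers blocked waiting for a connection
    final int available;  // idle connections kept in the pool
    final int max;        // configured connection limit

    PoolSnapshot(int leased, int pending, int available, int max) {
        this.leased = leased;
        this.pending = pending;
        this.available = available;
        this.max = max;
    }

    // Assumed rule of thumb: callers are queued while the whole limit is in use.
    boolean isStarved() {
        return pending > 0 && leased >= max;
    }

    @Override
    public String toString() {
        return "leased=" + leased + " pending=" + pending
                + " available=" + available + " max=" + max;
    }
}
```

With a reference to the connection manager (called cm here), a snapshot could be filled from PoolStats s = cm.getTotalStats() as new PoolSnapshot(s.getLeased(), s.getPending(), s.getAvailable(), s.getMax()). The same check works for a single route via getStats(route), which matters when the per-route limit is the one being hit.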

Optionally, you could create a subclass of PoolingHttpClientConnectionManager that logs the stats, e.g. at requestConnection() time (but maybe only on every nth call).
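The "only every nth call" part could look something like the sketch below. SampledStatsLogger is a hypothetical helper, not an HttpClient class; it is kept free of library types so the counting logic can be read and tried on its own. The stats text and the log destination are passed in as a Supplier and a Consumer, which is a design choice of this example only.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper: emits one stats line for every n connection requests.
final class SampledStatsLogger {
    private final AtomicLong calls = new AtomicLong();
    private final long n;
    private final Supplier<String> stats;  // produces the text to log
    private final Consumer<String> sink;   // where the text goes (e.g. a logger)

    SampledStatsLogger(long n, Supplier<String> stats, Consumer<String> sink) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
        this.stats = stats;
        this.sink = sink;
    }

    // Call once per connection request; the counter is safe across threads.
    void onRequest() {
        if (calls.incrementAndGet() % n == 0) {
            sink.accept(stats.get());
        }
    }
}
```

In a subclass of PoolingHttpClientConnectionManager, one would presumably override requestConnection(HttpRoute route, Object state), call onRequest() on a logger built with a supplier such as () -> getTotalStats().toString(), and then return super.requestConnection(route, state).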

