Apache httpclient 4.3.3 - speed tuning

I am trying to fetch a million URLs over a gigabit connection, but throughput varies between 5 MB/s and 12 MB/s (megabytes per second), which is far below the available bandwidth. The code I use:

    DnsResolver dnsResolver = new SystemDefaultDnsResolver();
    X509HostnameVerifier hostnameVerifier = new AllowAllHostnameVerifier();
    SSLContext sslcontext = SSLContexts.createSystemDefault();
    RedirectStrategy redirectStrategy = new LaxRedirectStrategy();

    HttpConnectionFactory<HttpRoute, ManagedHttpClientConnection> connFactory = new ManagedHttpClientConnectionFactory(
                    new DefaultHttpRequestWriterFactory(),
                    new DefaultHttpResponseParserFactory());

    Registry<ConnectionSocketFactory> socketFactoryRegistry = RegistryBuilder
                        .<ConnectionSocketFactory> create()
                        .register(
                                "https",
                                new SSLConnectionSocketFactory(sslcontext,
                                        hostnameVerifier))
                        .register("http", new PlainConnectionSocketFactory())
                        .build();
    SocketConfig socketConfig = SocketConfig.custom().setSoKeepAlive(false)
                    .setSoReuseAddress(false)
                    .setSoTimeout(15000).build();
    PoolingHttpClientConnectionManager manager = new PoolingHttpClientConnectionManager(
                    socketFactoryRegistry, connFactory, dnsResolver);
    manager.setDefaultSocketConfig(socketConfig);
    manager.setMaxTotal(1000);
    CloseableHttpClient httpClient = HttpClientBuilder.create().setUserAgent("Mozilla")
                    .setConnectionManager(manager)
                    .setRedirectStrategy(redirectStrategy)               
                    .setMaxConnPerRoute(-1).build();

    RequestConfig defaultConfig = RequestConfig.custom()
                    .setCookieSpec(CookieSpecs.IGNORE_COOKIES)
                    .setExpectContinueEnabled(false)
                    .setStaleConnectionCheckEnabled(false)
                    .setRedirectsEnabled(true)
                    .setMaxRedirects(5).build();

    RequestConfig rConfig= RequestConfig.copy(defaultConfig)
                    .setSocketTimeout(15000)
                    .setConnectionRequestTimeout(-1)
                    .setConnectTimeout(15000).build();

    ExecutorService executorService = Executors.newFixedThreadPool(640);

    FutureRequestExecutionService service = new FutureRequestExecutionService(httpClient, executorService);
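
One thing worth noting about the setup above: the executor has 640 threads while the connection pool allows 1000 connections. Assuming each request occupies one worker thread for its whole duration (which is how a blocking client behaves), the thread pool, not `setMaxTotal(1000)`, is the effective cap on in-flight requests. The following is a minimal JDK-only sketch of that effect; `PoolCapDemo` and `peakConcurrency` are hypothetical names used for illustration, and the short sleep merely stands in for a blocking HTTP call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class PoolCapDemo {

    // Runs `tasks` short blocking jobs on a fixed pool of `threads` workers
    // and returns the highest number of jobs observed running at the same time.
    public static int peakConcurrency(int threads, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        final AtomicInteger running = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();

        // Stand-in for a blocking request: count ourselves in, wait, count out.
        Callable<Void> task = () -> {
            int now = running.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(5);
            } finally {
                running.decrementAndGet();
            }
            return null;
        };

        List<Future<Void>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(task));
            }
            for (Future<Void> f : futures) {
                f.get(); // wait for completion and surface any failure
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        int peak = peakConcurrency(8, 100);
        System.out.println("peak in-flight tasks: " + peak + " (pool size 8)");
    }
}
```

However many tasks are queued, the reported peak never exceeds the pool size, so with the configuration in the question at most 640 of the 1000 pooled connections could be in use at once.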

The per-request configuration is:

    HttpGet httpget = new HttpGet("some url");
    httpget.setConfig(rConfig);
    httpget.setHeader("Connection", "close");

In the ResponseHandler I use the following code to consume the content:

    stream = response.getEntity().getContent();
    final byte[] content = IOUtils.toByteArray(stream);
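
For readers without the `IOUtils` helper on the classpath (assumed here to be the Apache Commons IO class), the snippet above amounts to reading the stream until end-of-stream into a growing buffer. Below is a rough JDK-only sketch of that step; `StreamDrain` and `drain` are hypothetical names, not part of HttpClient, and unlike the original snippet this version also closes the stream when it is done.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamDrain {

    // Reads the stream to end-of-stream, closes it, and returns all bytes read.
    public static byte[] drain(InputStream in) throws IOException {
        try (InputStream stream = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = stream.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for response.getEntity().getContent().
        byte[] body = drain(new ByteArrayInputStream("hello".getBytes("UTF-8")));
        System.out.println(body.length); // prints 5
    }
}
```

Buffering every body fully in memory is simple, but with hundreds of concurrent requests the combined size of in-flight responses counts against the heap.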

Each URL is from a different domain. The machine has 8 cores and 8 GB of RAM and runs 64-bit Debian Linux. How can I speed this up?

If you do not need automatic authentication, retries, or cookie management, and do not mind handling redirects manually, consider using the minimal HttpClient implementation. Minimal clients are built with a minimal execution pipeline consisting of mandatory protocol interceptors only, and should have the best performance characteristics for the same concurrency parameters (connection pool setup).

    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    CloseableHttpClient hc = HttpClients.createMinimal(cm);

And naturally you should want to re-use connections for optimal performance. This line seems to run counter to what I would consider best practice:

    httpget.setHeader("Connection", "close"); // Huh?
