
async HttpClient requests slowing down

I have a list of 10,000,000 URLs in a text file. I open each of them in my async/await method. At the beginning the speed is very good (nearly 10,000 URLs/min), but as the program runs it keeps decreasing, reaching 500 URLs/min after about 10 hours. When I restart the program and run it from the beginning, the situation is the same: fast at the beginning, then slower and slower. I'm working on Windows Server 2008 R2. I tested my code on various PCs with the same results. Can you tell me where the problem is?

 int finishedUrls = 0;
 IEnumerable<string> urls = File.ReadLines("urlslist.txt");
 // Process the URLs with up to 500 concurrent requests.
 await urls.ForEachAsync(500, async url =>
    {
        Uri newUri;
        // Skip lines that are not absolute URLs.
        if (!Uri.TryCreate(url, UriKind.Absolute, out newUri)) return;
        _uri = newUri;
        var timeout = new CancellationTokenSource(TimeSpan.FromSeconds(30));
        string html = "";
        // A new HttpClient is created and disposed for every URL.
        using (var _httpClient = new HttpClient { Timeout = TimeSpan.FromSeconds(30), MaxResponseContentBufferSize = 300000 })
        using (var _req = new HttpRequestMessage(HttpMethod.Get, _uri))
        using (var _response = await _httpClient.SendAsync(_req, HttpCompletionOption.ResponseContentRead, timeout.Token).ConfigureAwait(false))
        {
            if (_response != null &&
                (_response.StatusCode == HttpStatusCode.OK || _response.StatusCode == HttpStatusCode.NotFound))
            {
                // Dispose the response if the 30-second timeout fires while reading the body.
                using (var cancel = timeout.Token.Register(_response.Dispose))
                {
                    var rawResponse = await _response.Content.ReadAsByteArrayAsync().ConfigureAwait(false);
                    html = Encoding.UTF8.GetString(rawResponse);
                }
            }
        }
        Interlocked.Increment(ref finishedUrls);
    });

http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx

I believe you are exhausting your I/O completion ports. You need to throttle your requests. If you need higher concurrency than a single box can handle, distribute your concurrent requests across more machines. I'd suggest using the TPL to manage the concurrency. I ran into this exact same behavior doing similar things. Also, you should absolutely not be disposing your HttpClient per request. Pull that code out and use a single client.
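A minimal sketch of that advice, assuming a fixed number of worker tasks is an acceptable way to throttle. The names ThrottledFetcher, FetchAllAsync and maxConcurrency are hypothetical and not from the question. It keeps the question's 30-second timeout but routes every request through one shared HttpClient that the caller creates once:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledFetcher
{
    // Sends requests through one shared HttpClient with at most maxConcurrency
    // in flight, and returns how many URLs produced a response.
    public static async Task<int> FetchAllAsync(
        IEnumerable<string> urls, HttpClient client, int maxConcurrency)
    {
        int finished = 0;
        var gate = new object();
        using (var enumerator = urls.GetEnumerator())
        {
            // Each worker takes the next URL under a lock, so the input is
            // streamed lazily instead of creating one task per URL up front.
            Func<Task> worker = async () =>
            {
                while (true)
                {
                    string url;
                    lock (gate)
                    {
                        if (!enumerator.MoveNext()) return;
                        url = enumerator.Current;
                    }

                    Uri uri;
                    if (!Uri.TryCreate(url, UriKind.Absolute, out uri)) continue;
                    if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps) continue;

                    try
                    {
                        using (var timeout = new CancellationTokenSource(TimeSpan.FromSeconds(30)))
                        using (var response = await client.GetAsync(uri, timeout.Token).ConfigureAwait(false))
                        {
                            string html = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                            // Process html here.
                            Interlocked.Increment(ref finished);
                        }
                    }
                    catch (HttpRequestException) { /* network or DNS failure: skip this URL */ }
                    catch (OperationCanceledException) { /* timed out: skip this URL */ }
                }
            };

            await Task.WhenAll(Enumerable.Range(0, maxConcurrency).Select(_ => worker()))
                      .ConfigureAwait(false);
        }
        return finished;
    }
}
```

It could be called as `await ThrottledFetcher.FetchAllAsync(File.ReadLines("urlslist.txt"), client, 100)`, where client is a single HttpClient created at startup and 100 is only a starting guess to tune against measured throughput.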
