Multithreaded WebClient requests return error - System.Net.WebException

I have 5000+ pages that I want to download using WebClient. Since I want this done as fast as possible, I am trying to use multithreading (with a BlockingCollection in my case), but the program always crashes after a while with a System.Net.WebException. If I add a Thread.Sleep(3000) delay, the download slows down and the error still appears, just a little later.

It usually takes about 2-3 seconds to download one page.

Normally I would guess that there is a problem with my BlockingCollection, but it works fine with other tasks, so I am pretty sure that something is wrong with my WebClient requests. I think there might be some kind of overlap between the separate WebClient instances, but that is only a guess.

        Multithreading multiThread = new Multithreading(5); 
        for(int pageNumber = 1; pageNumber <= 5181; pageNumber++)
        {
            multiThread.EnqueueTask(new Action(() => //add task ("scrape the trader") to the multithread queue
            {
                using (WebClient client = new WebClient())
                {
                    client.DownloadFile("http://example.com/page=" + pageNumber.ToString(), @"C:\mypages\page " + pageNumber.ToString() + ".html");
                } 
            }));
            //I put the Thread.Sleep(123) delay here
        }

If I add a smaller delay (Thread.Sleep(100), for example) it works fine, but then I end up scraping Page # *whatever pageNumber's value is at the moment*, not in order as it usually does.
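The out-of-order page numbers are consistent with how C# lambdas capture a for loop variable: every queued Action shares the single pageNumber variable and reads whatever value it holds when a worker finally runs it. The sketch below shows the effect and the usual fix, which is to copy the counter into a local declared inside the loop body. The names CaptureDemo, CaptureLoopVariable, and CaptureLocalCopy are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

static class CaptureDemo
{
    // Every delegate closes over the SAME loop variable, so each one
    // sees the value the variable has when the delegate is invoked.
    public static List<Func<int>> CaptureLoopVariable(int lastPage)
    {
        var tasks = new List<Func<int>>();
        for (int pageNumber = 1; pageNumber <= lastPage; pageNumber++)
            tasks.Add(() => pageNumber);
        return tasks;
    }

    // Copying the counter into a local declared inside the loop body
    // gives each delegate its own variable.
    public static List<Func<int>> CaptureLocalCopy(int lastPage)
    {
        var tasks = new List<Func<int>>();
        for (int pageNumber = 1; pageNumber <= lastPage; pageNumber++)
        {
            int page = pageNumber;
            tasks.Add(() => page);
        }
        return tasks;
    }

    static void Main()
    {
        // The delegates run only after the loop has finished.
        foreach (var task in CaptureLoopVariable(3))
            Console.Write(task() + " ");   // prints "4 4 4 "
        Console.WriteLine();
        foreach (var task in CaptureLocalCopy(3))
            Console.Write(task() + " ");   // prints "1 2 3 "
        Console.WriteLine();
    }
}
```

Applied to the download loop, this would mean declaring something like `int page = pageNumber;` before the EnqueueTask call and using `page` inside the lambda.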

Here is my BlockingCollection (I think I got this code from Stack Overflow):

class Multithreading : IDisposable
{
      BlockingCollection<Action> _taskQ = new BlockingCollection<Action>();

      public Multithreading(int workerCount)
      {
        // Create and start a separate Task for each consumer:
        for (int i = 0; i < workerCount; i++)
          Task.Factory.StartNew (Consume);
      }

      public void Dispose() { _taskQ.CompleteAdding(); }

      public void EnqueueTask (Action action) { _taskQ.Add (action); }

      void Consume()
      {
        // This sequence that we’re enumerating will block when no elements
        // are available and will end when CompleteAdding is called. 
        foreach (Action action in _taskQ.GetConsumingEnumerable())
          action();     // Perform task.
      }
}

I also tried putting everything into an endless while loop and handling the error with try...catch statements, but apparently the error is not raised immediately, only after a while (not sure when).

Here is the whole exception:

An exception of type 'System.Net.WebException' occurred in System.dll but was not handled in user code

Additional information: An exception occurred during a WebClient request.
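The message above is generic. WebClient usually wraps the underlying failure (a timeout, a refused connection, a file I/O error) in the WebException's InnerException and Status. Because each queued Action runs on a worker thread inside Consume, a try...catch around the enqueueing loop never sees the exception; it has to be caught inside the Action itself. The following is only a sketch: DownloadDiagnostics, Describe, and TryDownload are hypothetical names, and writing to Console.Error is an assumed logging choice.

```csharp
using System;
using System.Net;

static class DownloadDiagnostics
{
    // Summarises the parts of a WebException that usually reveal the real cause.
    public static string Describe(WebException ex)
    {
        string inner = ex.InnerException != null
            ? ex.InnerException.GetType().Name + ": " + ex.InnerException.Message
            : "none";
        return "status=" + ex.Status + "; inner=" + inner;
    }

    // Hypothetical wrapper meant to be called from inside the queued Action,
    // so that one failed page does not take down the worker thread.
    public static bool TryDownload(string url, string path)
    {
        try
        {
            using (var client = new WebClient())
            {
                client.DownloadFile(url, path);
            }
            return true;
        }
        catch (WebException ex)
        {
            Console.Error.WriteLine(url + " failed: " + Describe(ex));
            return false;
        }
    }
}
```

Returning false instead of rethrowing leaves it to the caller to decide whether to re-enqueue the page.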

WebClient is not guaranteed to be thread safe. From MSDN:

Any instance members are not guaranteed to be thread safe.

Update

Use one HttpWebRequest for each request that you make. If you make a lot of requests to different web sites, it doesn't matter whether you use WebClient or HttpWebRequest.
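A minimal sketch of the one-request-per-download idea, assuming the .NET Framework APIs used in the question (WebRequest is marked obsolete on recent .NET versions). PageDownloader, CreateRequest, and Download are illustrative names, and the 30-second timeout is an assumed value rather than a recommendation from the answer.

```csharp
using System;
using System.IO;
using System.Net;

static class PageDownloader
{
    // Builds a fresh request object for a single URL; nothing is shared between calls.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.Timeout = 30000; // milliseconds; assumed value
        return request;
    }

    // Sends the request and streams the response body to a file.
    public static void Download(string url, string path)
    {
        HttpWebRequest request = CreateRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (FileStream file = File.Create(path))
        {
            body.CopyTo(file);
        }
    }
}
```

Each worker would call Download with its own URL and target path, so no request or response object is ever touched by two threads.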

If you make a lot of requests to the same web site, it is still not as inefficient as it seems. HttpWebRequest reuses connections under the hood. Microsoft uses something called service points, which you can access through the HttpWebRequest.ServicePoint property. If you click on the property definition you come to the ServicePoint documentation, where you can fine-tune the number of connections per web site, etc.
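As a rough illustration of the tuning mentioned above: on the .NET Framework, client applications default to two concurrent connections per host, so five workers hitting the same site may queue behind that limit. The sketch below raises the limit for one host through its ServicePoint. ConnectionTuning and LimitConnections are hypothetical names, the value 5 only mirrors the worker count in the question, and whether raising the limit resolves the exception is an assumption, not something the answer confirms.

```csharp
using System;
using System.Net;

static class ConnectionTuning
{
    // Looks up the ServicePoint that manages connections to the given host
    // and sets how many connections may be open to it at the same time.
    public static ServicePoint LimitConnections(string baseUrl, int limit)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(new Uri(baseUrl));
        servicePoint.ConnectionLimit = limit;
        return servicePoint;
    }
}
```

Calling `ConnectionTuning.LimitConnections("http://example.com/", 5)` once before starting the workers would apply to all later requests to that host. Setting ServicePointManager.DefaultConnectionLimit instead changes the default for service points created afterwards.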

