
I am getting WebException: “The operation has timed out” immediately on HttpWebRequest.GetResponse()

I am scraping the content of web pages heavily in a multi-threaded environment. I need a reliable downloader component that is tolerant of temporary server failures, connection drops, etc. My code is shown below.

I keep running into the same weird situation: everything starts perfectly, and 10 threads pull data concurrently for about 10 minutes. After that, I start getting a WebException with a timeout immediately after calling the GetResponse method of my request object. Taking a break (putting a thread to sleep) doesn't help. The only thing that helps is stopping and restarting the application, and then after another 10 minutes the problem comes back.

What I have already tried, without success:

  • closing/disposing the response object, both explicitly and via the "using" statement
  • calling request.Abort everywhere it could have helped
  • adjusting timeouts at the ServicePointManager/ServicePoint and WebRequest levels (both extending and shortening the timeout interval)
  • changing the KeepAlive property
  • calling CloseConnectionGroup
  • changing the number of threads that run simultaneously

Nothing helps! So it seems to be a bug, or at least very poorly documented behavior. I've seen a lot of questions about this on Google and on Stack Overflow, but none of them is fully answered. People basically suggest one of the things from the list above, and I have tried all of them.

    public TResource DownloadResource(Uri uri)
    {
        for (var resourceReadingAttempt = 0; resourceReadingAttempt <= MaxTries; resourceReadingAttempt++)
        {
            var request = (HttpWebRequest)WebRequest.Create(uri);
            HttpWebResponse response = null;
            for (var downloadAttempt = 0; downloadAttempt <= MaxTries; downloadAttempt++)
            {
                if (downloadAttempt > 0)
                {
                    var sleepFor = TimeSpan.FromSeconds(4 << downloadAttempt) + TimeSpan.FromMilliseconds(new Random(DateTime.Now.Millisecond).Next(1000));
                    Trace.WriteLine("Retry #" + downloadAttempt + " in " + sleepFor + ".");
                    Thread.Sleep(sleepFor);
                }
                Trace.WriteLine("Trying to get a resource by URL: " + uri);

                var watch = Stopwatch.StartNew();
                try
                {
                    response = (HttpWebResponse)request.GetResponse();
                    break;
                }
                catch (WebException exception)
                {
                    request.Abort();
                    Trace.WriteLine("Failed to get a resource by the URL: " + uri + " after " + watch.Elapsed + ". " + exception.Message);
                    if (exception.Status == WebExceptionStatus.Timeout)
                    {
                        //Trace.WriteLine("Closing " + request.ServicePoint.CurrentConnections + " current connections.");
                        //request.ServicePoint.CloseConnectionGroup(request.ConnectionGroupName);
                        //request.Abort();
                        continue;
                    }
                    else
                    {
                        using (var failure = exception.Response as HttpWebResponse)
                        {

                            Int32 code;
                            try { code = failure != null ? (Int32)failure.StatusCode : 500; }
                            catch { code = 500; }

                            if (code >= 500 && code < 600)
                            {
                                if (failure != null) failure.Close();
                                continue;
                            }
                            else
                            {
                                Trace.TraceError(exception.ToString());
                                throw;
                            }
                        }
                    }
                }
            }

            if (response == null) throw new ApplicationException("Unable to get a resource from URL \"" + uri + "\".");
            try
            {
                // response disposal is required to eliminate problems with timeouts
                // more about the problem: http://stackoverflow.com/questions/5827030/httpwebrequest-times-out-on-second-call
                // http://social.msdn.microsoft.com/Forums/en/netfxnetcom/thread/a2014f3d-122b-4cd6-a886-d619d7e3140e

                TResource resource;
                using (var stream = response.GetResponseStream())
                {
                    try
                    {
                        resource = this.reader.ReadFromStream(stream);
                    }
                    catch (IOException exception)
                    {
                        Trace.TraceError("Unable to read the resource stream: " + exception.ToString());
                        continue;
                    }
                }
                return resource;
            }
            finally
            {
                // recycle as much as you can
                if (response != null)
                {
                    response.Close();
                    (response as IDisposable).Dispose();
                    response = null;
                }
                if (request != null)
                {
                    //Trace.WriteLine("closing connection group: " + request.ConnectionGroupName);
                    //request.ServicePoint.CloseConnectionGroup(request.ConnectionGroupName);
                    request.Abort();
                    request = null;
                }
            }
        }
        throw new ApplicationException("Resource was not able to be acquired after several attempts.");
    }

I had the same problem. After searching a lot on the internet, I found one solution: limit the number of threads running at a time. I started using 2-3 threads at a time. Also set ServicePointManager.DefaultConnectionLimit = 200; this will really help.
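
As a rough illustration of the two suggestions above, here is a minimal sketch that raises the connection limit once at startup and caps the number of requests in flight with a semaphore. The class name ThrottledDownloader, the methods Configure and DownloadString, and the cap of 3 are assumptions made up for this example; they are not part of the question's code. It targets the same .NET Framework-style HttpWebRequest API the question uses.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

// Hypothetical helper for illustration only; not part of the original code.
public static class ThrottledDownloader
{
    // Assumed cap on simultaneous requests; the answer suggests 2-3.
    public const int MaxConcurrentDownloads = 3;

    // Lets at most MaxConcurrentDownloads callers past Wait() at a time.
    private static readonly SemaphoreSlim Gate =
        new SemaphoreSlim(MaxConcurrentDownloads, MaxConcurrentDownloads);

    // Call once at startup, before the first request is created.
    public static void Configure()
    {
        ServicePointManager.DefaultConnectionLimit = 200;
    }

    public static string DownloadString(Uri uri)
    {
        Gate.Wait(); // blocks while MaxConcurrentDownloads requests are in flight
        try
        {
            // A new request object is created for every call.
            var request = (HttpWebRequest)WebRequest.Create(uri);
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var stream = response.GetResponseStream())
            using (var reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }
        finally
        {
            Gate.Release(); // always free the slot, even when the request fails
        }
    }
}
```

Worker threads would call ThrottledDownloader.Configure() once and then ThrottledDownloader.DownloadString(uri) as needed. Note that the sketch creates a new HttpWebRequest for every call. The documentation for GetResponse states that repeated calls on the same request return the same response object and do not reissue the request, so a retry loop would probably need to create a new request for each attempt as well.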

