
How to debug possible timeout/performance issue?

I have a site that is deployed to two load-balanced servers. On only one of the servers, I will occasionally see exceptions that seem to be timeout or performance related. They happen on a variety of pages, in a variety of function calls. Example exceptions include:

System.Net.WebException: The request was aborted: The request was canceled.

System.Net.WebException: The operation has timed out

System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host

System.Web.HttpException: Request timed out.

When I have the information from the stack trace, it looks like they are related to calls to a web service hosted on the same set of load balanced servers. A page may throw an exception on one request, and then not on another request. I'm already using ELMAH on the site, which is how I know the errors are taking place.

I don't really know how to begin debugging this. I don't have direct access to the production servers--any requests for information need to go through the client, and need to be fairly specific. Any suggestions?

Edit: There are other sites on these two servers which use the same web services and have not shown any problems.

First, I would identify the pattern behind these timeout exceptions. They may all stem from a single issue. Find out which web methods are timing out, and pin down which server serviced each failing request.

Also look at network congestion and server CPU usage (on both the web and DB servers) at the time of each timeout, to identify any other potential bottlenecks.
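As a rough illustration of the pattern-finding step, here is a minimal Python sketch that tallies errors by server and exception type, and by hour of day. It assumes the ELMAH entries have already been exported into a list of records; the field names `host`, `type`, and `time`, and the sample values, are hypothetical and would need to match whatever the export actually contains.

```python
from collections import Counter
from datetime import datetime


def summarize_errors(records):
    """Count errors per (host, exception type) and per hour of day."""
    by_host_and_type = Counter()
    by_hour = Counter()
    for rec in records:
        by_host_and_type[(rec["host"], rec["type"])] += 1
        # Assumes ISO 8601 timestamps; adjust the parsing to the real export.
        by_hour[datetime.fromisoformat(rec["time"]).hour] += 1
    return by_host_and_type, by_hour


# Made-up sample data, standing in for an exported error log.
sample = [
    {"host": "WEB01", "type": "System.Net.WebException", "time": "2024-01-05T14:02:11"},
    {"host": "WEB01", "type": "System.Web.HttpException", "time": "2024-01-05T14:03:40"},
    {"host": "WEB01", "type": "System.Net.WebException", "time": "2024-01-06T14:10:05"},
]

by_host_and_type, by_hour = summarize_errors(sample)
for key, count in by_host_and_type.most_common():
    print(key, count)
for hour, count in sorted(by_hour.items()):
    print(f"{hour:02d}:00", count)
```

If most errors cluster on one host or in one time window, that gives the client a specific thing to look at.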

If the same site is up and running without errors on the other server, "debugging" the programmer's way is not going to be useful. You need to think more like a sysadmin, and probably work jointly with one.

You'll first need to find out what is different between the two servers. That may include:

  • Hardware configuration and capacity
  • OS and base system configuration and versions
  • Other Software running concurrently.
  • The load balancer configuration and/or working mode
  • Some asymmetry in the network from the two servers viewpoint
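If the client can send back the same list of settings from each server, the comparison itself is easy to automate. The following is only a sketch: the setting names and values are invented for illustration, and the real inventory would be whatever the sysadmin is able to collect.

```python
def diff_inventories(server_a, server_b):
    """Return {setting: (value_on_a, value_on_b)} for every setting that differs."""
    differences = {}
    for key in sorted(set(server_a) | set(server_b)):
        a_val = server_a.get(key, "<missing>")
        b_val = server_b.get(key, "<missing>")
        if a_val != b_val:
            differences[key] = (a_val, b_val)
    return differences


# Hypothetical inventories, one per server.
web01 = {"RAM (GB)": "8", "OS patch level": "SP2", "Antivirus": "on"}
web02 = {"RAM (GB)": "16", "OS patch level": "SP2"}

diff = diff_inventories(web01, web02)
for setting, (a_val, b_val) in diff.items():
    print(f"{setting}: WEB01={a_val} WEB02={b_val}")
```

Settings that are missing on one side are reported too, since a component that is installed on only one server is as interesting as a version mismatch.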

While you review that, make sure the production staff are collecting, at a minimum:

  • CPU load measurements
  • I/O load measurements
  • Monitoring data for the DB, web server, and other basic components
  • System and application logs (and that someone is reviewing them)

Hopefully with this done, you'll be able to correlate the timing-out events with another system characteristic.
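One way to do that correlation, sketched in Python: for each error timestamp, look up the closest monitoring sample and see whether the metric was unusual at that moment. The CPU figures, timestamps, and the two-minute tolerance below are made up; the real samples would come from whatever monitoring the production staff set up.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_sample(sample_times, samples, when, max_gap):
    """Return the sample closest in time to `when`, or None if none is within max_gap.

    `samples` must be sorted by time, and `sample_times` holds its timestamps.
    """
    i = bisect_left(sample_times, when)
    # The closest sample is either just before or just after the insertion point.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(samples)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(sample_times[j] - when))
    if abs(sample_times[best] - when) > max_gap:
        return None
    return samples[best]


# Made-up CPU samples as (timestamp, cpu_percent), one per minute.
cpu_samples = [
    (datetime(2024, 1, 5, 14, 0), 35.0),
    (datetime(2024, 1, 5, 14, 1), 97.5),
    (datetime(2024, 1, 5, 14, 2), 98.0),
    (datetime(2024, 1, 5, 14, 3), 40.0),
]
sample_times = [t for t, _ in cpu_samples]

# Made-up error timestamps, e.g. taken from the ELMAH log.
error_times = [datetime(2024, 1, 5, 14, 2, 11), datetime(2024, 1, 5, 16, 0, 0)]

matches = [
    nearest_sample(sample_times, cpu_samples, t, timedelta(minutes=2))
    for t in error_times
]
for t, match in zip(error_times, matches):
    print(t, "->", match)
```

An error with no nearby sample comes back as None, which is itself useful to know: it means the monitoring does not cover that period.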

HTH!


 