
Unexplained thread creation and handle count increase in a WCF service

We have several WCF services hosted on IIS on a Server 2008 R2 Amazon EC2 instance with 32 cores. We are using .NET Framework 4.5.2. The issue at hand is an unexplained increase in handle counts: some of our services accrue hundreds of thousands of open handles after being active for more than a day. (When I forced a garbage collection using a third-party tool, the handle count dropped to around 2k.)

In investigating this, I created a simple service with no functionality and started it under IIS. No client requests were being made to this service. In one hour, there were 20k+ handles open under the service's process. Looking at the process with procmon, I could see bursts of 20+ thread exits followed by thread creates every 40 seconds or so. I then switched the service's application pool from .NET Framework v4.0 to v2.0 and started the service again; the handle count stayed at approximately 500 open handles for the entire hour. I was not able to reproduce this issue on several of my own machines (not at Amazon).

I'm aware that there were significant thread pool changes in CLR 4.0 (http://msdn.microsoft.com/en-us/magazine/ff960958.aspx), but I do not know 1) why I'm seeing bursts of thread creation activity with no client requests or work being performed by the service, and 2) why the thread handles and associated event handles are not being released.

I recently ran into this issue with a .NET service (hosted in IIS on Server 2012 R2 with .NET 4.5.1). While sitting idle, it would accumulate >30,000 handles. Using !htrace in WinDbg I could see the handles all being created in this stack:

Call Site
clr!Thread::CreateNewOSThread+0x7f
clr!Thread::CreateNewThread+0x90
clr!ThreadpoolMgr::CreateUnimpersonatedThread+0xc7
clr!ThreadpoolMgr::MaybeAddWorkingWorker+0x113
clr!ManagedPerAppDomainTPCount::SetAppDomainRequestsActive+0x24
clr!ThreadpoolMgr::SetAppDomainRequestsActive+0x2a
clr!ThreadPoolNative::RequestWorkerThread+0x2b
mscorlib_ni!System.Threading.ThreadPoolWorkQueue.Dispatch()
mscorlib_ni![ContextTransitionFrame: 0000002b15e4ef28] 
clr!CallDescrWorkerInternal+0x83
clr!CallDescrWorkerWithHandler+0x4a
clr!MethodDescCallSite::CallTargetWorker+0x380
clr!QueueUserWorkItemManagedCallback+0x2a
clr!ManagedThreadBase_DispatchInner+0x2d
clr!ManagedThreadBase_DispatchMiddle+0x6c
clr!ManagedThreadBase_DispatchOuter+0x75
clr!ManagedThreadBase_DispatchInCorrectAD+0x15
clr!Thread::DoADCallBack+0x25b
clr!ManagedThreadBase_DispatchInner+0x69
clr!ManagedThreadBase_DispatchMiddle+0x6c
clr!ManagedThreadBase_DispatchOuter+0x75
clr!ManagedThreadBase_FullTransitionWithAD+0x2f
clr!ManagedPerAppDomainTPCount::DispatchWorkItem+0xe3
clr!ThreadpoolMgr::ExecuteWorkRequest+0x64
clr!ThreadpoolMgr::WorkerThreadStart+0x2b6
clr!Thread::intermediateThreadProc+0x7d
KERNEL32!BaseThreadInitThunk+0xd
ntdll!RtlUserThreadStart+0x1d

Each call to CreateNewOSThread would create 1 Thread handle and 4 Event handles, which were not being cleaned up (the thread would finish running but the handles would stick around). I never tracked down what was adding tasks to the thread pool, but I did notice that because the service was "idle", the GC was never running.
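One way to check whether handle growth lines up with an absence of collections is to log the handle count next to the per-generation GC counts. The following is only a rough sketch under assumptions: the class name `HandleSampler`, the 10-second interval, and the output format are all made up for illustration and are not from the original service. It samples on the calling thread with `Thread.Sleep` rather than a timer, so the sampler itself does not queue work to the thread pool being observed.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical diagnostic helper; not part of the original service.
public static class HandleSampler
{
    // Builds one log line: handle count, OS thread count, and how many
    // collections each GC generation has run so far in this process.
    public static string Sample()
    {
        // A fresh Process object is used per sample because its
        // properties are cached after first access.
        using (Process p = Process.GetCurrentProcess())
        {
            return string.Format(
                "{0:o} handles={1} threads={2} gc0={3} gc1={4} gc2={5}",
                DateTime.UtcNow,
                p.HandleCount,
                p.Threads.Count,
                GC.CollectionCount(0),
                GC.CollectionCount(1),
                GC.CollectionCount(2));
        }
    }

    public static void Main()
    {
        // Arbitrary interval; runs until the process is stopped.
        while (true)
        {
            Console.WriteLine(Sample());
            Thread.Sleep(TimeSpan.FromSeconds(10));
        }
    }
}
```

Inside an IIS-hosted service, `Sample()` would presumably be called from a dedicated background thread and written to a log file instead of the console. If the handle count climbs while the gc0/gc1/gc2 values stay flat, that is consistent with the behavior described above.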

For some reason the thread pool manager doesn't dispose the handles when the worker thread is allowed to exit, and instead relies on the garbage collector to do it.

As a test, I added a method to the service that manually invokes the garbage collector. After observing the linear increase in handles, I triggered a GC on the service and watched the handle count drop back down to normal levels.
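The test method itself was not posted, so the following is only a guess at what such a probe could look like, assuming the standard `GC.Collect` / `GC.WaitForPendingFinalizers` pattern. The names `GcProbe` and `ForceCollect` are hypothetical, and how the method is exposed (for example as a WCF operation) is left out.

```csharp
using System;
using System.Diagnostics;

// Hypothetical probe; the original answer's method was not shown.
public static class GcProbe
{
    // Forces a blocking full collection and reports the process
    // handle count before and after it.
    public static string ForceCollect()
    {
        int before = CurrentHandleCount();

        GC.Collect();                  // collect unreachable objects
        GC.WaitForPendingFinalizers(); // let finalizers release OS handles
        GC.Collect();                  // reclaim objects freed by finalizers

        int after = CurrentHandleCount();
        return string.Format("handles before={0} after={1}", before, after);
    }

    private static int CurrentHandleCount()
    {
        // New Process object each time so the value is not a cached one.
        using (Process p = Process.GetCurrentProcess())
        {
            return p.HandleCount;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ForceCollect());
    }
}
```

Forcing collections like this is meant as a diagnostic to confirm the cause, not necessarily as a fix to leave in production code.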

In a w3wp.exe instance hosting one WCF service, there are at least 3 AppDomains in .NET 4: one is called SharedAppDomains, which includes over 20 .NET Framework assemblies; another is called Default; and the last one is named something like /LM/W3Svc... (some funky name), which contains your WCF application assemblies as well as some direct dependencies. What tool told you that there's only one app domain not containing other assemblies? The simplest way to check is to run Process Explorer as Administrator and look at the .NET Assemblies of the w3wp.exe instance.

Nevertheless, even if your WCF service is sitting idle and not handling any incoming requests, w3wp.exe itself is not idle, since it is a hosting process responsible for many housekeeping tasks. In my Hello World WCF service on .NET 4.5.1, in a .NET 4 app pool on IIS 7 on Windows 7, the w3wp.exe thread count jumps between 44 and 47. Memory usage and other resource figures are basically stable.

You mentioned that the problem only occurred on AWS machines, not on your other machines. So you had better find out all loaded app domains and their assemblies by running Process Explorer as Administrator, compare the list with that of the w3wp.exe instance on your own PC, and single out the usual suspects that could be doing more work than expected. Of course, it could be that w3wp.exe got compromised and is doing dodgy things; however, at this stage, just check the assemblies and app domains first. This is not an answer, but the comment area on SO restricts the length of comments, so hopefully this is somewhere to start checking things.

