xperf WinDBG C# .NET 4.5.2 Application - Understanding process dump

Question

Under heavy load, our application drives a powerful server to 100% CPU usage. Looking at the threads in a process dump, some of them have accumulated around 10 minutes of CPU time. None of them gives me any insight when I run !CLRStack.

The !runaway command gives me:

0:030> !runaway
 User Mode Time
  Thread       Time
  53:2e804      0 days 0:10:04.703
  30:31894      0 days 0:07:51.593
  33:47100      0 days 0:07:24.890
  42:11e54      0 days 0:06:45.875
  35:35e18      0 days 0:06:07.578
  41:54464      0 days 0:05:49.796
  47:57700      0 days 0:05:45.000
  44:3c2d4      0 days 0:05:44.265
  32:3898c      0 days 0:05:43.593
  50:54894      0 days 0:05:41.968
  51:5bc58      0 days 0:05:40.921
  43:14af4      0 days 0:05:40.734
  48:35074      0 days 0:05:40.406
  ...
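
The times reported by !runaway are cumulative per thread, so they are easier to judge once they are turned into numbers that can be compared against the process uptime or against a second dump. A minimal Python sketch, assuming the listing has been saved as text in the format shown above; `parse_runaway` and `total_cpu` are hypothetical helper names, not part of WinDbg:

```python
import re
from datetime import timedelta

# Sample lines copied from the !runaway output above.
RUNAWAY_OUTPUT = """\
  53:2e804      0 days 0:10:04.703
  30:31894      0 days 0:07:51.593
  33:47100      0 days 0:07:24.890
"""

# "<debugger thread index>:<OS thread id in hex>   <d> days <h>:<mm>:<ss>.<fff>"
LINE_RE = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d{2}):(\d{2})\.(\d{3})\s*$"
)

def parse_runaway(text):
    """Return a list of (thread_index, os_thread_id, cpu_time) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip headers, "..." and blank lines
        idx, tid, days, hours, minutes, seconds, millis = m.groups()
        cpu = timedelta(days=int(days), hours=int(hours), minutes=int(minutes),
                        seconds=int(seconds), milliseconds=int(millis))
        rows.append((int(idx), int(tid, 16), cpu))
    return rows

def total_cpu(rows):
    """Sum the user-mode time of all parsed threads."""
    return sum((cpu for _, _, cpu in rows), timedelta())

if __name__ == "__main__":
    rows = parse_runaway(RUNAWAY_OUTPUT)
    for idx, tid, cpu in sorted(rows, key=lambda r: r[2], reverse=True):
        print(f"thread {idx} (OS id 0x{tid:x}): {cpu.total_seconds():.3f} s")
    print("total:", total_cpu(rows))
```

Running the same parser over two dumps taken some time apart and subtracting the per-thread values would show which threads are burning CPU right now, as opposed to threads that merely accumulated time earlier.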

Calling !DumpStack on one of those threads, I get:

0000001ab442f900 00007ff9ef4c1148 KERNELBASE!WaitForSingleObjectEx+0x94, calling ntdll!NtWaitForSingleObject
0000001ab442f980 00007ff9e920beb2 clr!SVR::gc_heap::compute_new_dynamic_data+0x17b, calling clr!SVR::gc_heap::desired_new_allocation
0000001ab442f9a0 00007ff9e90591eb clr!CLREventWaitHelper2+0x38, calling kernel32!WaitForSingleObjectEx
0000001ab442f9b0 00007ff9e90e0d2c clr!WriteBarrierManager::UpdateEphemeralBounds+0x1c, calling clr!WriteBarrierManager::NeedDifferentWriteBarrier
0000001ab442f9e0 00007ff9e9059197 clr!CLREventWaitHelper+0x1f, calling clr!CLREventWaitHelper2
0000001ab442fa40 00007ff9e9059120 clr!CLREventBase::WaitEx+0x70, calling clr!CLREventWaitHelper
0000001ab442fa70 00007ff9ef4c149c KERNELBASE!SetEvent+0xc, calling ntdll!NtSetEvent
0000001ab442faa0 00007ff9e90ef1e1 clr!SVR::gc_heap::set_gc_done+0x22, calling clr!CLREventBase::Set
0000001ab442fad0 00007ff9e90e9331 clr!SVR::gc_heap::gc_thread_function+0x8a, calling clr!CLREventBase::WaitEx
0000001ab442fb00 00007ff9e92048e7 clr!SVR::gc_heap::gc_thread_stub+0x7a, calling clr!SVR::gc_heap::gc_thread_function
0000001ab442fb60 00007ff9e91a0318 clr!Thread::CLRSetThreadStackGuarantee+0x48, calling kernel32!SetThreadStackGuaranteeStub
0000001ab442fb90 00007ff9e91a01ef clr!Thread::CommitThreadStack+0x10, calling clr!Thread::CLRSetThreadStackGuarantee
0000001ab442fbd0 00007ff9e910df0b clr!ClrFlsSetValue+0x57, calling kernel32!SetLastErrorStub
0000001ab442fc00 00007ff9e92048dc clr!SVR::gc_heap::gc_thread_stub+0x6f, calling clr!_chkstk
0000001ab442fc40 00007ff9f0d316ad kernel32!BaseThreadInitThunk+0xd
0000001ab442fc70 00007ff9f1e54409 ntdll!RtlUserThreadStart+0x1d

What is it telling me? I see a lot of calls into the CLR, but I can't work out where the problem is. After running .reload (suggested by Thomas), I can now see the GC calls.

Update 1

After running xperf, each w3wp.exe is consuming about 45% of CPU. Filtering by one of them and grouping by Function, there is a function labeled "?" that is responsible for 13.62%; the others are 2.67% or less. How can I find out what this "?" is?

Update 2

I ran xperf again, and the function JIT_MonEnterWorker_InlineGetThread_GetThread_PatchLabel is responsible for 12.31% of CPU usage. The "?" function is still there.

Grouping by Stack:

Line #, Stack, Count, Weight (in view), TimeStamp, % Weight
2,   |- ?!?, 501191, 501222.365294, , 35.51
3,   |    |- clr.dll!JITutil_MonContention, 215749, 215752.552227, , 15.28
4,   |    |- clr.dll!JIT_MonEnterWorker_InlineGetThread_GetThread_PatchLabel, 170804, 170777.100191, , 12.10

As you can see, those two are responsible for more than 27% of CPU usage (per process, so it is significant).

Update 3

After using wpr.exe (suggested by @magicandre1981):
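
The 27% figure, and how much of the unresolved "?!?" frame the two monitor helpers account for, can be checked directly from the rows. A small Python sketch, assuming the rows are exported as comma-separated text exactly as pasted above; `parse_rows` and `monitor_summary` are hypothetical helper names:

```python
import csv

# Rows copied from the "Grouping by Stack" table above.
XPERF_ROWS = """\
2,   |- ?!?, 501191, 501222.365294, , 35.51
3,   |    |- clr.dll!JITutil_MonContention, 215749, 215752.552227, , 15.28
4,   |    |- clr.dll!JIT_MonEnterWorker_InlineGetThread_GetThread_PatchLabel, 170804, 170777.100191, , 12.10
"""

def parse_rows(text):
    """Return {function name: (weight, percent weight)} for each table row."""
    result = {}
    for fields in csv.reader(text.splitlines()):
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        _line, stack, _count, weight, _timestamp, percent = (f.strip() for f in fields)
        name = stack.lstrip("|- ")  # drop the tree-drawing prefix
        result[name] = (float(weight), float(percent))
    return result

def monitor_summary(rows):
    """Return (summed % weight of the clr.dll rows, their share of the '?!?' weight)."""
    clr_rows = [v for name, v in rows.items() if name.startswith("clr.dll!")]
    percent = sum(p for _, p in clr_rows)
    share = sum(w for w, _ in clr_rows) / rows["?!?"][0]
    return percent, share

if __name__ == "__main__":
    percent, share = monitor_summary(parse_rows(XPERF_ROWS))
    print(f"{percent:.2f}% of CPU, {share:.1%} of the '?!?' frame")
```

Under these assumptions the two clr.dll rows sum to 27.38% and make up roughly three quarters of the weight attributed to "?!?".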

wpr.exe -start cpu
wpr.exe -stop result.etl

I found out that FormsAuthentication and some unnecessary calls to Ninject on the critical path were contributing around 16% of CPU usage. I still don't understand the threads that have been running for 10 minutes or more.

Update 4

I tried DebugDiag (suggested by @leppie), and it just confirmed that the hanging threads all look similar to:

Thread ID: 53     Total CPU Time: 00:09:11.406     Entry Point for Thread: clr!Thread::intermediateThreadProc 
Thread ID: 35     Total CPU Time: 00:07:26.046     Entry Point for Thread: clr!SVR::gc_heap::gc_thread_stub 
Thread ID: 50     Total CPU Time: 00:07:01.515     Entry Point for Thread: clr!SVR::gc_heap::gc_thread_stub 
Thread ID: 29     Total CPU Time: 00:06:02.264     Entry Point for Thread: clr!SVR::gc_heap::gc_thread_stub 
Thread ID: 31     Total CPU Time: 00:06:41.281     Entry Point for Thread: clr!SVR::gc_heap::gc_thread_stub
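
A listing like this is easier to read when CPU time is totaled per entry point. A minimal Python sketch, assuming the DebugDiag lines are available as plain text in the format above; `cpu_by_entry_point` is a hypothetical helper, not something DebugDiag provides:

```python
import re
from collections import defaultdict
from datetime import timedelta

# Lines copied from the DebugDiag report above.
DEBUGDIAG_LINES = """\
Thread ID: 53     Total CPU Time: 00:09:11.406     Entry Point for Thread: clr!Thread::intermediateThreadProc
Thread ID: 35     Total CPU Time: 00:07:26.046     Entry Point for Thread: clr!SVR::gc_heap::gc_thread_stub
Thread ID: 50     Total CPU Time: 00:07:01.515     Entry Point for Thread: clr!SVR::gc_heap::gc_thread_stub
Thread ID: 29     Total CPU Time: 00:06:02.264     Entry Point for Thread: clr!SVR::gc_heap::gc_thread_stub
Thread ID: 31     Total CPU Time: 00:06:41.281     Entry Point for Thread: clr!SVR::gc_heap::gc_thread_stub
"""

LINE_RE = re.compile(
    r"Thread ID:\s*(\d+)\s+"
    r"Total CPU Time:\s*(\d+):(\d{2}):(\d{2})\.(\d{3})\s+"
    r"Entry Point for Thread:\s*(\S+)"
)

def cpu_by_entry_point(text):
    """Return {entry point: total CPU time} summed over all matching lines."""
    totals = defaultdict(timedelta)
    for m in LINE_RE.finditer(text):
        _thread_id, hours, minutes, seconds, millis, entry = m.groups()
        totals[entry] += timedelta(hours=int(hours), minutes=int(minutes),
                                   seconds=int(seconds), milliseconds=int(millis))
    return dict(totals)

if __name__ == "__main__":
    totals = cpu_by_entry_point(DEBUGDIAG_LINES)
    for entry, cpu in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{cpu}  {entry}")
```

For the five lines shown, the threads entering at clr!SVR::gc_heap::gc_thread_stub (the server GC worker threads) together account for about 27 minutes of CPU time, against about 9 minutes for the single intermediateThreadProc thread.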

or are due to StackExchange.Redis:

DomainBoundILStubClass.IL_STUB_PInvoke(Int32, IntPtr[], IntPtr[], IntPtr[], TimeValue ByRef)+e1 
[[InlinedCallFrame] (StackExchange.Redis.SocketManager.select)] StackExchange.Redis.SocketManager.select(Int32, IntPtr[], IntPtr[], IntPtr[], TimeValueByRef) 
StackExchange.Redis.SocketManager.ReadImpl()+889 
StackExchange.Redis.SocketManager.Read()+66

or

[[GCFrame]] 
[[HelperMethodFrame_1OBJ] (System.Threading.Monitor.ObjWait)] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object) 
mscorlib_ni!System.Threading.Monitor.Wait(System.Object, Int32)+19 
StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[[System.__Canon, mscorlib]](StackExchange.Redis.Message, StackExchange.Redis.ResultProcessor`1, StackExchange.Redis.ServerEndPoint)+24f 
StackExchange.Redis.RedisBase.ExecuteSync[[System.__Canon, mscorlib]](StackExchange.Redis.Message, StackExchange.Redis.ResultProcessor`1, StackExchange.Redis.ServerEndPoint)+77 
[[StubHelperFrame]] 
StackExchange.Redis.RedisDatabase.SetMembers(StackExchange.Redis.RedisKey, StackExchange.Redis.CommandFlags)+ee

Answer 1

Doing this by hand takes bravery ;) Please check the official Microsoft DebugDiag 2.2: https://www.microsoft.com/en-us/download/details.aspx?id=49924. It comes with an analyzer, so you don't have to do the analysis by hand. With DebugDiag, I think you will find your problem faster than ever...

Answer 2

The slowness could come from slow application code, or it may come from the .NET engine itself.

First, check clr.dll: if it has problems, you can download it and replace it on your computer. If it doesn't have any problem, try this:

I think you should review your application code, check every corner that takes a lot of processing, and try to balance the load of the code's operations between CPU and RAM. Loops, object initialization, recursive functions, etc. all put load on the CPU. Try to store shared objects in static or constant members.

Answer 1 (score 2, 2016-01-24 21:39:46)

Answer 2 (score -1, 2016-01-13 09:39:49)