The web API hosted in IIS has unexpected behavior after one of the releases:
These are the metrics collected using the IIS Web Service(_Total) Current Connections
counter:
A connection is an active session on your server: when a client connects, the counter increments; when it disconnects, it decrements.
For some time everything works stably, but then the number of connections crosses a threshold of roughly 5,000 and the API starts returning 503.2 errors.
From API:
The serverRuntime@appConcurrentRequestLimit setting is being exceeded.
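For context, this limit lives in IIS's applicationHost.config under system.webServer/serverRuntime, and its default value is 5000, which matches the observed ~5,000-connection ceiling; requests beyond it are rejected with 503.2. A sketch of the relevant fragment:

```xml
<!-- %windir%\System32\inetsrv\config\applicationHost.config -->
<system.webServer>
  <!-- appConcurrentRequestLimit defaults to 5000 -->
  <serverRuntime appConcurrentRequestLimit="5000" />
</system.webServer>
```

Raising the limit would only postpone the failure here, since (as shown below) the queued requests never complete.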
From event logs:
A worker process '5352' serving application pool '%placeholder%' failed to stop a listener channel for protocol 'http' in the allotted time. The data field contains the error number.
Restarting solves this problem, but it is not a long-term solution.
I have a full dump of the application pool, but I have not found any problems such as deadlocks. I do not have much experience with dump analysis, so I used commands such as !clrstack, !dumpstack, !dlk, !runaway, !threadpool, and so on.
There are not many busy worker or completion-port threads, and everything looks good, including CPU utilization and the related metrics. I also tried the Debug Diagnostic Tool, but to no avail: it showed almost the same picture I had already seen in WinDbg.
What could cause the problem? How should I correlate these symptoms? Which tools should I use, or what else should I try with the ones already mentioned? I understand that the problem is in the application, but I cannot diagnose it.
I am stuck and do not know the direction for further analysis. If my question is off-topic for this site, I can move it.
UPD:
Especially if you are using HttpClient, check this one here. You might even be exhausting ports.
I know about that problem, but this is a different case. The component the problem occurs in is the last one in the call chain:
Client <-> API <-> DB
From your version control, find the changed calls to limit the investigation area and find where the new problem stems from.
I tried that already, unfortunately without success. I could simply roll back task by task, but I do not think that is a good solution, because I need to learn how to diagnose such problems.
List of managed threads:
0:000> !runaway
User Mode Time
Thread Time
48:ea8 0 days 0:00:25.187
50:18ec 0 days 0:00:23.171
49:1b1c 0 days 0:00:22.593
52:5c4 0 days 0:00:22.562
51:1bd8 0 days 0:00:22.312
45:109c 0 days 0:00:22.187
47:152c 0 days 0:00:22.078
46:1988 0 days 0:00:20.859
56:2b8 0 days 0:00:17.078
26:1f40 0 days 0:00:16.281
24:140c 0 days 0:00:16.265
27:17c8 0 days 0:00:16.187
22:181c 0 days 0:00:16.109
25:1f88 0 days 0:00:16.031
23:9fc 0 days 0:00:15.968
20:10ec 0 days 0:00:15.765
21:1f74 0 days 0:00:15.750
57:ff8 0 days 0:00:12.390
9:1ef4 0 days 0:00:04.734
59:1b8c 0 days 0:00:04.375
7:18b4 0 days 0:00:04.187
6:1798 0 days 0:00:04.000
4:1ac0 0 days 0:00:03.671
10:13d4 0 days 0:00:03.484
8:1f70 0 days 0:00:03.203
55:434 0 days 0:00:03.171
60:1e34 0 days 0:00:03.031
5:1f0 0 days 0:00:02.468
44:16d0 0 days 0:00:02.203
61:bd4 0 days 0:00:02.156
40:1c34 0 days 0:00:02.031
43:177c 0 days 0:00:02.000
38:1b5c 0 days 0:00:01.890
36:2210 0 days 0:00:01.890
3:2264 0 days 0:00:01.796
39:1e5c 0 days 0:00:01.765
34:1ea8 0 days 0:00:01.734
62:16ac 0 days 0:00:01.718
37:10a4 0 days 0:00:01.609
42:2028 0 days 0:00:01.593
35:10b4 0 days 0:00:01.515
41:187c 0 days 0:00:01.453
64:1764 0 days 0:00:00.703
65:124c 0 days 0:00:00.593
63:13a0 0 days 0:00:00.453
58:1a9c 0 days 0:00:00.421
70:14dc 0 days 0:00:00.406
54:8d8 0 days 0:00:00.390
69:704 0 days 0:00:00.265
66:1d0c 0 days 0:00:00.156
28:2120 0 days 0:00:00.140
72:18c4 0 days 0:00:00.109
73:b40 0 days 0:00:00.015
0:1330 0 days 0:00:00.015
77:50c 0 days 0:00:00.000
76:1840 0 days 0:00:00.000
75:1614 0 days 0:00:00.000
74:1c1c 0 days 0:00:00.000
71:824 0 days 0:00:00.000
68:18e8 0 days 0:00:00.000
67:1518 0 days 0:00:00.000
53:1ed4 0 days 0:00:00.000
33:1838 0 days 0:00:00.000
32:1e6c 0 days 0:00:00.000
31:1a40 0 days 0:00:00.000
30:608 0 days 0:00:00.000
29:2e0 0 days 0:00:00.000
19:176c 0 days 0:00:00.000
18:1fa0 0 days 0:00:00.000
17:1394 0 days 0:00:00.000
16:14f0 0 days 0:00:00.000
15:13cc 0 days 0:00:00.000
14:5d0 0 days 0:00:00.000
13:944 0 days 0:00:00.000
12:4f8 0 days 0:00:00.000
11:2360 0 days 0:00:00.000
2:310 0 days 0:00:00.000
1:fe8 0 days 0:00:00.000
Output of !threadpool:
0:000> !threadpool
CPU utilization: 14%
Worker Thread: Total: 11 Running: 0 Idle: 9 MaxLimit: 32767 MinLimit: 8
Work Request in Queue: 0
--------------------------------------
Number of Timers: 2
--------------------------------------
Completion Port Thread:Total: 7 Free: 7 MaxFree: 16 CurrentLimit: 7 MaxLimit: 1000 MinLimit: 8
All of the running requests have Http Status: 200 (NULL) and are not completed. Unfortunately, none of them are associated with any thread.
0:000> !whttp /running
HttpContext Thread Time Out Running Status Verb
0000029e1953fb58 -- 00:01:50 00:14:22 200 GET
0000029e19ce1b58 -- 00:01:50 00:18:38 200 GET
0000029e19dabc00 -- 00:01:50 00:16:08 200 GET
0000029e19db28c8 -- 00:01:50 00:14:15 200 GET
0000029e19db9898 -- 00:01:50 00:10:51 200 GET
0000029e19dc9cf8 -- 00:01:50 00:18:38 200 GET
0000029e19de4188 -- 00:01:50 00:10:51 200 GET
0000029e19e48350 -- 00:01:50 00:10:45 200 GET
0000029e19ea3428 -- 00:01:50 00:18:05 200 GET
0000029e19eaab88 -- 00:01:50 00:10:45 200 GET
0000029e19ec91a0 -- 00:01:50 00:10:44 200 GET
0000029e19f74e30 -- 00:01:50 00:10:39 200 GET
0000029e19fa8ca8 -- 00:01:50 00:10:39 200 GET
0000029e19fe56d8 -- 00:01:50 00:10:50 200 GET
0000029e19ffa778 -- 00:01:50 00:14:21 200 GET
0000029e1a0b6088 -- 00:01:50 00:10:38 200 GET
0000029e1a12b040 -- 00:01:50 00:10:38 200 GET
0000029e1a16cd50 -- 00:01:50 00:10:37 200 GET
0000029e1a2b22e0 -- 00:01:50 00:19:18 200 GET
0000029e1a2cf618 -- 00:01:50 00:19:27 200 GET
0000029e1a2f3620 -- 00:01:50 00:19:18 200 GET
0000029e1a2ff808 -- 00:01:50 00:19:18 200 GET
0000029e1a30aa20 -- 00:01:50 00:19:22 200 GET
0000029e1a314b98 -- 00:01:50 00:19:22 200 GET
0000029e1a3352a0 -- 00:01:50 00:19:17 200 GET
0000029e1a34e6f8 -- 00:01:50 00:18:38 200 GET
0000029e1a353248 -- 00:01:50 00:19:10 200 GET
0000029e1a371260 -- 00:01:50 00:19:20 200 GET
0000029e1a39f800 -- 00:01:50 00:18:37 200 GET
0000029e1a3b32e8 -- 00:01:50 00:18:36 200 GET
0000029e1a3d18b8 -- 00:01:50 00:18:03 200 GET
0000029e1a3d6f40 -- 00:01:50 00:18:22 200 GET
0000029e1a3e2670 -- 00:01:50 00:19:14 200 GET
0000029e1a404510 -- 00:01:50 00:18:53 200 GET
0000029e1a413eb8 -- 00:01:50 00:18:38 200 GET
0000029e1a416a48 -- 00:01:50 00:18:53 200 GET
0000029e1a41c888 -- 00:01:50 00:19:13 200 GET
0000029e1a4288c0 -- 00:01:50 00:19:13 200 GET
0000029e1a442118 -- 00:01:50 00:19:13 200 GET
0000029e1a48b098 -- 00:01:50 00:18:53 200 GET
0000029e1a51edc0 -- 00:01:50 00:14:15 200 GET
0000029e1a52a420 -- 00:01:50 00:18:04 200 GET
0000029e1a55bb48 -- 00:01:50 00:19:12 200 GET
In this case there are no request queues, no high thread contention or anything like that, just a black hole and spikes in the IIS connection count.
I noticed that the GC generation sizes start increasing at the moment the problem arises. Maybe I can get something by analyzing the heap.
This answer can be divided into two parts:
[Part 1]
It was noticed that the size of the GC generations increases at the moment the problem begins (see the screenshot in the question), so we can analyze the heap for anomalies that may reveal the cause of the problem.
The WinDbg command output has been shortened to fit the character limit.
With the full dump of the process, I retrieved the list of HTTP requests that had not completed at the time of the problem:
0:000> !aspxpagesext -n -nq
Address Completed Timeout Time (secs) ThreadId ReturnCode Verb
AS 00000284fcc095c8 no 110 4870 200 GET
AS 00000286000192a8 no 110 1858 200 GET
AS 00000286000242e0 no 110 1854 200 GET
AS 000002860002ff78 no 110 1851 200 GET
AS 00000286000433c8 no 110 1829 200 GET
AS 00000286000527d8 no 110 1824 200 GET
AS 000002860006caa0 no 110 1074 200 GET
AS 00000286004d26c0 no 110 1561 200 GET
AS 00000286005386f0 no 110 1557 200 GET
AS 00000286005d5090 no 110 1546 200 GET
AS 00000286005edb68 no 110 1381 200 GET
AS 0000028600603290 no 110 1539 200 GET
AS 0000028600604818 no 110 1539 200 GET
AS 000002860062b640 no 110 1537 200 GET
AS 0000028600636f10 no 110 1369 200 GET
AS 0000028600642a08 no 110 1536 200 GET
AS 000002860065bbc8 no 110 1346 200 GET
AS 0000028600666a30 no 110 1339 200 GET
AS 0000028600673748 no 110 1332 200 GET
AS 000002860067fb28 no 110 1327 200 GET
AS 000002860068e000 no 110 1322 200 GET
AS 00000286006a1450 no 110 1320 200 GET
AS 00000286006b0418 no 110 1320 200 GET
AS 00000286006bfc48 no 110 1318 200 GET
AS 00000286006cf7e8 no 110 1317 200 GET
AS 00000286006e2dd8 no 110 1311 200 GET
AS 00000286006ee748 no 110 1309 200 GET
AS 00000286006fb200 no 110 1308 200 GET
AS 000002860070c870 no 110 1288 200 GET
AS 000002880224d568 no 110 13 200 GET
3038 contexts found (2815 displayed).
AS == possibly async request
SL == session locked
None of the requests has an associated thread, so we cannot look inside them, but we can find the roots that keep the request objects from being garbage-collected.
000002880224d568 is the address of one of the last HttpContext instances, which the server had been processing for 13 seconds. Analysis of the other requests showed similar results:
0:000> !gcroot 000002880224d568
Thread a38:
000000a31747ed00 00007ffc85d931d3 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
rbp-80: 000000a31747ed40
-> 00000287fc9157e0 System.Threading.Thread
-> 00000285fc8c8438 System.Runtime.Remoting.Contexts.Context
-> 00000285fc8c8150 System.AppDomain
-> 00000285fcc1f9d8 System.EventHandler
-> 00000284fc9205c8 System.Object[]
-> 00000285fc8d6ab8 System.EventHandler
-> 00000285fc8d6530 System.Web.Hosting.HostingEnvironment
-> 00000285fc8d66d8 System.Collections.Hashtable
-> 00000285fc8d6728 System.Collections.Hashtable+bucket[]
-> 00000285fc94fb08 System.Web.Hosting.ApplicationMonitors+AppMonitorRegisteredObject
-> 00000285fc94d9b8 System.Web.Hosting.ApplicationMonitors
-> 00000285fc94d9d0 System.Web.Hosting.AspNetMemoryMonitor
-> 00000285fc94f040 System.Web.Hosting.LowPhysicalMemoryMonitor
-> 00000285fc94f318 System.Threading.Timer
-> 00000285fc94f390 System.Threading.TimerHolder
-> 00000285fc94f338 System.Threading.TimerQueueTimer
-> 00000285fc94d0c0 System.Threading.TimerQueueTimer
-> 00000285fc8e47f8 System.Threading.TimerQueueTimer
-> 00000285fc8e4710 System.Threading.TimerQueueTimer
-> 00000285fc8e4628 System.Threading.TimerQueueTimer
-> 00000285fc8e4540 System.Threading.TimerQueueTimer
-> 00000285fc8e43f8 System.Threading.TimerQueueTimer
-> 00000285fc8db808 System.Threading.TimerQueueTimer
-> 00000285fc8db7a8 System.Threading.TimerCallback
-> 00000285fc8db4e8 System.Web.RequestTimeoutManager
-> 00000285fc8db520 System.Web.Util.DoubleLinkList[]
-> 00000285fc8db690 System.Web.Util.DoubleLinkList
-> 00000286fcaca7b0 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 00000286fcddae98 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 00000284fce820f8 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 00000287fcfd1ac8 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 00000287fd075278 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 0000028801784ed8 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 0000028700fb7f20 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 00000285fec96f00 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 000002860076f5f8 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 000002860162ef58 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 0000028601d98e90 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 000002850246e118 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 000002880224ef58 System.Web.RequestTimeoutManager+RequestTimeoutEntry
-> 000002880224d568 System.Web.HttpContext
HandleTable:
00000284fbca15e0 (pinned handle)
-> 00000288fc893f30 System.Object[]
-> 00000284fc911968 NamespaceClientApi.Models.Helpers.KeyedSemaphoreSlim
-> 00000284fc9119a8 System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[NamespaceClientApi.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper, Project.Library.SomeClientApiModels]]
-> 00000287ff97efc0 System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[NamespaceClientApi.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper, Project.Library.SomeClientApiModels]][]
-> 00000285fd3a4880 NamespaceClientApi.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper
-> 00000285fd3a48b0 System.Threading.SemaphoreSlim
-> 00000285fd3a8698 System.Threading.SemaphoreSlim+TaskNode
-> 00000285fd3b5168 System.Threading.SemaphoreSlim+TaskNode
-> 00000287fcc17908 System.Threading.SemaphoreSlim+TaskNode
-> 00000284fce05cf0 System.Threading.SemaphoreSlim+TaskNode
-> 00000287fcc37988 System.Threading.SemaphoreSlim+TaskNode
-> 00000285fd3cd0e8 System.Threading.SemaphoreSlim+TaskNode
-> 00000286fcdbf498 System.Threading.SemaphoreSlim+TaskNode
-> 00000286fcaab498 System.Threading.SemaphoreSlim+TaskNode
-> 00000287fcd16370 System.Threading.SemaphoreSlim+TaskNode
-> 00000285fd3ee190 System.Threading.SemaphoreSlim+TaskNode
-> 0000028502949e58 System.Action
-> 0000028502949e38 System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner
-> 0000028502949ee8 NamespaceClientApi.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper+<WaitAsync>d__8
-> 0000028502949e98 System.Threading.Tasks.Task`1[[System.IDisposable, mscorlib]]
-> 000002850294a138 System.Threading.Tasks.SynchronizationContextAwaitTaskContinuation
-> 000002850294a038 System.Action
-> 000002850294a018 System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner
-> 000002850294a0c8 NamespaceClientApi.Models.Helpers.MemoryLocalCache+<GetCacheItemAsync>d__9`1[[System.Collections.Generic.HashSet`1[[System.ValueTuple`2[[System.Int64, mscorlib],[System.Int32, mscorlib]], mscorlib]], System.Core]]
-> 000002850294a078 System.Threading.Tasks.Task`1[[System.Collections.Generic.HashSet`1[[System.ValueTuple`2[[System.Int64, mscorlib],[System.Int32, mscorlib]], mscorlib]], System.Core]]
-> 000002850294a338 System.Threading.Tasks.SynchronizationContextAwaitTaskContinuation
-> 000002850294a268 System.Action
-> 000002850294a248 System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner
-> 000002850294a2f8 Namespace.Services.ProjectInfo+<EventFlagsAsync>d__88
-> 000002850294a2a8 System.Threading.Tasks.Task`1[[System.Collections.Generic.HashSet`1[[System.ValueTuple`2[[System.Int64, mscorlib],[System.Int32, mscorlib]], mscorlib]], System.Core]]
-> 000002850294a540 System.Threading.Tasks.SynchronizationContextAwaitTaskContinuation
-> 000002850294a468 System.Action
-> 000002850294a448 System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner
-> 000002850294a4f8 Namespace.Filters.EventPropertiesFilter+<SetEventFlags>d__2
-> 000002850294a4a8 System.Threading.Tasks.Task`1[[Namespace.Filters.EventPropertiesFilter, Project.Library.Some]]
-> 000002850294a740 System.Threading.Tasks.SynchronizationContextAwaitTaskContinuation
-> 000002850294a670 System.Action
-> 000002850294a650 System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner
-> 000002850294a700 Namespace.Services.ProjectInfo+<InitEventFlagsAsync>d__90
-> 000002850294a6b0 System.Threading.Tasks.Task`1[[Namespace.Filters.EventPropertiesFilter, Project.Library.Some]]
-> 000002850294a9f0 System.Threading.Tasks.SynchronizationContextAwaitTaskContinuation
00000284fbcc4ec0 (strong handle)
-> 000002880224dc08 System.Web.RootedObjects
-> 000002880224d568 System.Web.HttpContext
Found 3 unique roots (run '!GCRoot -all' to see all roots).
They are all stuck in the EventFlagsAsync asynchronous method, which uses a cache manager based on SemaphoreSlim. The actual number of System.Threading.SemaphoreSlim+TaskNode objects is 1543.
From SemaphoreSlim.cs:
// Task in a linked list of asynchronous waiters
private sealed class TaskNode : Task<bool>, IThreadPoolWorkItem
{
internal TaskNode Prev, Next;
internal TaskNode() : base() {}
[SecurityCritical]
void IThreadPoolWorkItem.ExecuteWorkItem()
{
bool setSuccessfully = TrySetResult(true);
Contract.Assert(setSuccessfully, "Should have been able to complete task");
}
[SecurityCritical]
void IThreadPoolWorkItem.MarkAborted(ThreadAbortException tae)
{
/* nop */
}
}
Task in a linked list of asynchronous waiters
It seems that a TaskNode is created by the CreateAndAddAsyncWaiter method each time the semaphore is held by another thread. This forms the kind of queue that was mentioned above:
/// <summary>
/// Asynchronously waits to enter the <see cref="SemaphoreSlim"/>,
/// using a 32-bit signed integer to measure the time interval,
/// while observing a <see cref="T:System.Threading.CancellationToken"/>.
/// </summary>
/// <param name="millisecondsTimeout">
/// The number of milliseconds to wait, or <see cref="Timeout.Infinite"/>(-1) to wait indefinitely.
/// </param>
/// <param name="cancellationToken">The <see cref="T:System.Threading.CancellationToken"/> to observe.</param>
/// <returns>
/// A task that will complete with a result of true if the current thread successfully entered
/// the <see cref="SemaphoreSlim"/>, otherwise with a result of false.
/// </returns>
/// <exception cref="T:System.ObjectDisposedException">The current instance has already been
/// disposed.</exception>
/// <exception cref="ArgumentOutOfRangeException"><paramref name="millisecondsTimeout"/> is a negative number other than -1,
/// which represents an infinite time-out.
/// </exception>
public Task<bool> WaitAsync(int millisecondsTimeout, CancellationToken cancellationToken)
{
CheckDispose();
// Validate input
if (millisecondsTimeout < -1)
{
throw new ArgumentOutOfRangeException(
"totalMilliSeconds", millisecondsTimeout, GetResourceString("SemaphoreSlim_Wait_TimeoutWrong"));
}
// Bail early for cancellation
if (cancellationToken.IsCancellationRequested)
return Task.FromCancellation<bool>(cancellationToken);
lock (m_lockObj)
{
// If there are counts available, allow this waiter to succeed.
if (m_currentCount > 0)
{
--m_currentCount;
if (m_waitHandle != null && m_currentCount == 0) m_waitHandle.Reset();
return s_trueTask;
}
// If there aren't, create and return a task to the caller.
// The task will be completed either when they've successfully acquired
// the semaphore or when the timeout expired or cancellation was requested.
else
{
Contract.Assert(m_currentCount == 0, "m_currentCount should never be negative");
var asyncWaiter = CreateAndAddAsyncWaiter();
return (millisecondsTimeout == Timeout.Infinite && !cancellationToken.CanBeCanceled) ?
asyncWaiter :
WaitUntilCountOrTimeoutAsync(asyncWaiter, millisecondsTimeout, cancellationToken);
}
}
}
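The waiter build-up is easy to reproduce in isolation. In this hypothetical sketch (not the project's code), a SemaphoreSlim is entered and never released, so every later WaitAsync call goes down the CreateAndAddAsyncWaiter path and parks an incomplete TaskNode-backed task in the internal linked list, the same pattern that showed up 1543 times in the dump:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class WaiterQueueDemo
{
    static void Main()
    {
        var sem = new SemaphoreSlim(1, 1);

        // First caller enters the semaphore and, like the buggy code,
        // never releases it.
        sem.Wait();

        // Every subsequent WaitAsync() allocates a TaskNode and links it
        // into the semaphore's internal list of asynchronous waiters.
        Task<bool>[] waiters = Enumerable.Range(0, 5)
            .Select(_ => sem.WaitAsync(Timeout.Infinite, CancellationToken.None))
            .ToArray();

        Console.WriteLine(sem.CurrentCount);                    // 0
        Console.WriteLine(waiters.Count(t => !t.IsCompleted));  // 5
    }
}
```

Unless some other caller eventually releases the semaphore, those tasks stay pending forever, which is exactly what keeps the HttpContext objects rooted.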
OK, now we need to move to the SemaphoreWrapper object, in whose context the threads are coordinated. Let's get access to the async state machine, which is responsible for the transitions between states, and then move on to the method being executed:
0:000> !do poi(0000028502949e38+10)
Name: Namespace.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper+<WaitAsync>d__8
MethodTable: 00007ffc28e470a8
EEClass: 00007ffc28e1e9e8
Size: 80(0x50) bytes
File: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\392547e9\cbe69b5a\assembly\dl3\6151fdbb\00ebcf32_4ebad501\Project.Library.SomeModels.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc85fb9180 4000426 10 System.Int32 1 instance 0 <>1__state
00007ffc28e25f48 4000427 18 ...sable, mscorlib]] 1 instance 0000028502949f00 <>t__builder
00007ffc28e464d8 4000428 8 ...+SemaphoreWrapper 0 instance 00000285fd3a4880 <>4__this
00007ffc85fab338 4000429 30 ...CancellationToken 1 instance 0000028502949f18 cancelToken
00007ffc85fadf70 400042a 38 ...iguredTaskAwaiter 1 instance 0000028502949f20 <>u__1
0:000> !do 00000285fd3a4880
Name: Namespace.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper
MethodTable: 00007ffc28e464d8
EEClass: 00007ffc28e1e698
Size: 48(0x30) bytes
File: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\392547e9\cbe69b5a\assembly\dl3\6151fdbb\00ebcf32_4ebad501\Project.Library.SomeModels.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc28e46e68 40003fe 8 ...ClientApiModels]] 0 instance 00000285fd3a4840 _parentRelease
00007ffc85fa6fd0 40003ff 10 ...ing.SemaphoreSlim 0 instance 00000285fd3a48b0 _semaphoreSlim
00007ffc85fb9180 4000400 20 System.Int32 1 instance 1547 _useCount
00007ffc85fb6830 4000401 18 System.String 0 instance 00000285fd3a4800 <Key>k__BackingField
_useCount is a private field of SemaphoreWrapper that indicates how many attempts have been made to acquire the semaphore. Here it is 1547, which looks bad. Apparently we are facing a deadlock: one of the tasks never finishes, so the queue keeps growing. As a result, we see the API freeze and the growth of the GC generations.
[Part 2]
The following listings dump the objects that are useful for understanding the state of the cache manager at the moment the problem occurred:
0:000> !do poi(000002850294a018+10)
Name: Namespace.Models.Helpers.MemoryLocalCache+<GetCacheItemAsync>d__9`1[[System.Collections.Generic.HashSet`1[[System.ValueTuple`2[[System.Int64, mscorlib],[System.Int32, mscorlib]], mscorlib]], System.Core]]
MethodTable: 00007ffc291bbf58
EEClass: 00007ffc28e1ad98
Size: 112(0x70) bytes
File: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\392547e9\cbe69b5a\assembly\dl3\6151fdbb\00ebcf32_4ebad501\Project.Library.SomeModels.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc85fb9180 4000410 30 System.Int32 1 instance 0 <>1__state
00007ffc85f68e40 4000411 38 ...Canon, mscorlib]] 1 instance 000002850294a100 <>t__builder
00007ffc85fb6830 4000412 8 System.String 0 instance 0000028502949c98 key
00007ffc288c8510 4000413 10 ....MemoryLocalCache 0 instance 000002880225d460 <>4__this
00007ffc85fb6830 4000414 18 System.String 0 instance 00000286fc9bbb30 methodName
00007ffc85fab338 4000415 50 ...CancellationToken 1 instance 000002850294a118 ct
0000000000000000 4000416 20 0 instance 0000028502949bb8 updateMethod
00007ffc85fb9180 4000417 34 System.Int32 1 instance 600 cacheTime
00007ffc858ab3e8 4000418 28 System.IDisposable 0 instance 0000000000000000 <>7__wrap1
00007ffc28e24bc8 4000419 58 ...sable, mscorlib]] 1 instance 000002850294a120 <>u__1
00007ffc85f6d2a0 400041a 60 ...Canon, mscorlib]] 1 instance 000002850294a128 <>u__2
0:000> !do 000002880225d460
Name: Namespace.Models.Helpers.MemoryLocalCache
MethodTable: 00007ffc288c8510
EEClass: 00007ffc288aa6c8
Size: 32(0x20) bytes
File: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\392547e9\cbe69b5a\assembly\dl3\6151fdbb\00ebcf32_4ebad501\Project.Library.SomeModels.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc85fb6830 40002c7 8 System.String 0 instance 0000000000000000 _dbPrefix
00007ffc8fabf6e8 40002c8 10 ...ching.MemoryCache 0 instance 00000287fc909e10 _cache
00007ffc28e45998 40002c9 20 ...eyedSemaphoreSlim 0 static 00000284fc911968 _lock
0:000> !do 00000284fc911968
Name: Namespace.Models.Helpers.KeyedSemaphoreSlim
MethodTable: 00007ffc28e45998
EEClass: 00007ffc28e1d758
Size: 40(0x28) bytes
File: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\392547e9\cbe69b5a\assembly\dl3\6151fdbb\00ebcf32_4ebad501\Project.Library.SomeModels.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc85fb6e10 40002c4 8 System.Object 0 instance 00000284fc911990 _lock
00007ffc28e46580 40002c5 10 ...ClientApiModels]] 0 instance 00000284fc9119a8 _wrapperMap
00007ffc85fa0b50 40002c6 18 System.Boolean 1 instance 0 _isDisposed
0:000> !do 00000284fc9119a8
Name: System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[Namespace.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper, Project.Library.SomeModels]]
MethodTable: 00007ffc28e46580
EEClass: 00007ffc85964200
Size: 80(0x50) bytes
File: C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc85fb9118 400186d 8 System.Int32[] 0 instance 00000287ff97e148 buckets
00007ffc86a89ae0 400186e 10 ...non, mscorlib]][] 0 instance 00000287ff97efc0 entries
00007ffc85fb9180 400186f 38 System.Int32 1 instance 765 count
00007ffc85fb9180 4001870 3c System.Int32 1 instance 39027 version
00007ffc85fb9180 4001871 40 System.Int32 1 instance 763 freeList
00007ffc85fb9180 4001872 44 System.Int32 1 instance 2 freeCount
00007ffc85f77d10 4001873 18 ...Canon, mscorlib]] 0 instance 00000285fc8c9050 comparer
00007ffc85f75060 4001874 20 ...Canon, mscorlib]] 0 instance 0000000000000000 keys
00007ffc85f86870 4001875 28 ...Canon, mscorlib]] 0 instance 0000000000000000 values
00007ffc85fb6e10 4001876 30 System.Object 0 instance 0000000000000000 _syncRoot
0:000> .foreach /pS 16 /ps 1 (addr {!da 00000287ff97efc0}) { .if ($sicmp("${addr}", "null") != 0) { !do poi(${addr}+8) } }
<Note: this object has an invalid CLASS field>
Invalid object
Name: Namespace.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper
MethodTable: 00007ffc28e464d8
EEClass: 00007ffc28e1e698
Size: 48(0x30) bytes
File: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\392547e9\cbe69b5a\assembly\dl3\6151fdbb\00ebcf32_4ebad501\Project.Library.SomeModels.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc28e46e68 40003fe 8 ...ClientApiModels]] 0 instance 00000285fd3a4840 _parentRelease
00007ffc85fa6fd0 40003ff 10 ...ing.SemaphoreSlim 0 instance 00000285fd3a48b0 _semaphoreSlim
00007ffc85fb9180 4000400 20 System.Int32 1 instance 1547 _useCount
00007ffc85fb6830 4000401 18 System.String 0 instance 00000285fd3a4800 <Key>k__BackingField
Name: Namespace.Models.Helpers.KeyedSemaphoreSlim+SemaphoreWrapper
MethodTable: 00007ffc28e464d8
EEClass: 00007ffc28e1e698
Size: 48(0x30) bytes
File: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\392547e9\cbe69b5a\assembly\dl3\6151fdbb\00ebcf32_4ebad501\Project.Library.SomeModels.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc28e46e68 40003fe 8 ...ClientApiModels]] 0 instance 00000285fd3cbd18 _parentRelease
00007ffc85fa6fd0 40003ff 10 ...ing.SemaphoreSlim 0 instance 00000285fd3cbd88 _semaphoreSlim
00007ffc85fb9180 4000400 20 System.Int32 1 instance 846 _useCount
00007ffc85fb6830 4000401 18 System.String 0 instance 00000285fd3cbcb0 <Key>k__BackingField
After analyzing the private field _wrapperMap of the KeyedSemaphoreSlim object, we can see which keys the problem occurred with, and hence which endpoint. In this case there are two keys with abnormally large _useCount values.
1547
0:000> !do 00000285fd3a4800
Name: System.String
MethodTable: 00007ffc85fb6830
EEClass: 00007ffc85896cb8
Size: 62(0x3e) bytes
File: C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String: _EventFlagsAsync_7
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc85fb9180 4000273 8 System.Int32 1 instance 18 m_stringLength
00007ffc85fb79e8 4000274 c System.Char 1 instance 5f m_firstChar
00007ffc85fb6830 4000278 a0 System.String 0 shared static Empty
>> Domain:Value 00000284fc4e9070:NotInit 00000284fc600120:NotInit <<
846
0:000> !do 00000285fd3cbcb0
Name: System.String
MethodTable: 00007ffc85fb6830
EEClass: 00007ffc85896cb8
Size: 102(0x66) bytes
File: C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String: _GetLiveMenuCacheAsync_13
Fields:
MT Field Offset Type VT Attr Value Name
00007ffc85fb9180 4000273 8 System.Int32 1 instance 38 m_stringLength
00007ffc85fb79e8 4000274 c System.Char 1 instance 5f m_firstChar
00007ffc85fb6830 4000278 a0 System.String 0 shared static Empty
>> Domain:Value 00000284fc4e9070:NotInit 00000284fc600120:NotInit <<
We can conclude that the problem occurs in the context of the EventFlagsAsync method, which is called as part of GetLiveMenuCacheAsync, an endpoint with almost the same name. Let's see what EventFlagsAsync is and how it is called:
internal async Task<EventPropertiesFilter> InitEventFlagsAsync()
{
return await _eventProperties.SetEventFlags(EventFlagsAsync(_days));
}
public async Task<HashSet<ValueTuple<long, int>>> EventFlagsAsync(int days)
{
return await _cache.GetCacheItemAsync<HashSet<ValueTuple<long, int>>>(days.ToString(),
async () => await GetEventFlagsAsync(days), (int)MemoryLocalCache.CacheTimeSec.Min10, CancellationToken.None);
}
private async Task<HashSet<ValueTuple<long, int>>> GetEventFlagsAsync(int days)
{
HashSet<ValueTuple<long, int>> ecs = new HashSet<ValueTuple<long, int>>();
foreach (var ef in await _dbConnection.QueryAsync<ValueTuple<long, int>>(Query_EventFlags,
new { StartDate = DateTime.UtcNow.AddHours(-6), EndDate = DateTime.UtcNow.AddDays(days) }))
{
ecs.Add(ef);
}
return ecs;
}
public async Task<EventPropertiesFilter> SetEventFlags(Task<HashSet<ValueTuple<long, int>>> eventFlags)
{
_eventFlags = _eventFlags ?? await eventFlags;
return this;
}
As a result, we arrive at the source line with incorrect Task handling that led to this whole story:
_eventFlags = _eventFlags ?? await eventFlags;
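A hypothetical sketch of why this line is dangerous, with invented names, and one safer shape: when the cache field is already populated, the null-coalescing operator short-circuits and the already-started task is never awaited, so whatever that task queued (here, a semaphore waiter) is orphaned. Passing a factory instead means the work only starts when it will actually be observed:

```csharp
using System;
using System.Threading.Tasks;

class LazySetDemo
{
    static int _fetches;     // counts how many times the "DB query" ran
    static string _cached;

    // Stand-in for the expensive call guarded by the keyed semaphore.
    static async Task<string> FetchAsync()
    {
        _fetches++;
        await Task.Yield();
        return "flags";
    }

    // Buggy shape: the caller has already started the task; when the
    // cache is warm, ?? short-circuits and the task is never awaited,
    // so anything it queued on a semaphore is never released.
    static async Task SetHot(Task<string> hot)
    {
        _cached = _cached ?? await hot;
    }

    // Safer shape: take a factory, so the work starts only when needed.
    static async Task SetLazy(Func<Task<string>> factory)
    {
        if (_cached == null)
            _cached = await factory();
    }

    static async Task Main()
    {
        _cached = "warm";
        await SetHot(FetchAsync());   // fetch ran anyway: orphaned work
        Console.WriteLine(_fetches);  // 1

        await SetLazy(FetchAsync);    // cache warm: fetch never started
        Console.WriteLine(_fetches);  // still 1
    }
}
```

This is only an illustration of the shape of the bug, not the project's actual fix; the real code would also need to make sure every acquired semaphore is released on all paths.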
UPD:
There is an extension that makes working with asynchronous methods easier: !dumpasync.
The !dumpasync command is a WinDbg extension that produces a list of async method call stacks that may not appear on thread call stacks. This information is useful when diagnosing the cause of an app hang that is waiting for async methods to complete.
Unfortunately, at the time it did not work with my dump, so I had to manage without it; the incompatibility may no longer be an issue (I have not had time to look into this more closely, although it is worth it).
Since it's a post-release issue, I'd suggest the following.
From your version control, find the changed calls to limit the investigation area and find where the new problem stems from.
Check this answer here for tracking long-running requests. Most probably you have some requests waiting for a long time (or several of them that never complete).
Especially if you are using HttpClient, check this one here. You might even be exhausting ports.
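The usual HttpClient pitfall is creating a new instance per request, which leaves sockets in TIME_WAIT and can exhaust ephemeral ports under load. A minimal sketch of the recommended shape (class name and timeout are placeholders):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class ApiClient
{
    // One shared instance for the process lifetime: HttpClient is
    // thread-safe for concurrent requests and pools its connections.
    private static readonly HttpClient Http = new HttpClient
    {
        Timeout = TimeSpan.FromSeconds(30)
    };

    public static Task<string> GetAsync(string url) => Http.GetStringAsync(url);
}
```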