Find who holds an SRW lock in a native process with WinDbg

I have a program written in C++ and I am having trouble finding which thread has acquired a Slim Reader/Writer (SRW) lock. I searched and found "Determining which method is holding a ReaderWriterLockSlim WriteLock", but that question is about a program written in C#. Besides, some commands, for example !rwlock, are unavailable.

0:796> !handle 0 ff Mutant
Handle c
  Type          Mutant
  Attributes    0
  GrantedAccess 0x1f0001:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         QueryState
  HandleCount   4
  PointerCount  103240
  Name          \BaseNamedObjects\DBWinMutex
  Object Specific Information
    Mutex is Free
Handle 474
  Type          Mutant
  Attributes    0
  GrantedAccess 0x1f0001:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         QueryState
  HandleCount   2
  PointerCount  65536
  Name          \BaseNamedObjects\SM0:928:304:WilStaging_02
  Object Specific Information
    Mutex is Free
2 handles of type Mutant
0:796> kb
RetAddr           : Args to Child                                                           : Call Site
00007ff9`b6e3d33a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!ZwWaitForAlertByThreadId+0x14
00007ff9`a85726a9 : 00000000`00000000 00000000`00000000 00000192`83338180 00000000`00000000 : ntdll!RtlAcquireSRWLockExclusive+0x13a
00007ff9`a6231724 : c000000d`00000000 00000000`00000000 00000192`83338180 00000000`00000002 : MSVCP140!mtx_do_lock+0x7d [d:\agent\_work\2\s\src\vctools\crt\crtw32\stdcpp\thr\mutex.cpp @ 106]
00007ff9`a626749e : 00000192`f6a26e38 00000193`4aaa3d80 00000052`897fea60 00000000`00000000 : AZSDK!AZConnection::Post+0x54 [g:\prod\sdk\src\connection.cpp @ 1147]
...
00007ff9`9c8ba9c1 : 00000192`c3b3d770 00000000`00000000 00000192`f5d616b0 00000000`00000000 : prod!Task::Execute+0x28 [g:\prod\src\task.cpp @ 51]
00007ff9`b6e97529 : 00000193`491b9830 00000000`7ffe0386 00000052`897ff998 00000193`491b98f8 : prod!Proxy::TaskExecuter+0x11 [g:\prod\src\proxy.cpp @ 2042]
00007ff9`b6e3bec4 : 00000000`00000000 00000192`f1dd03a0 00000000`00000000 00000000`00000000 : ntdll!TppSimplepExecuteCallback+0x99
00007ff9`b6c47e94 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!TppWorkerThread+0x644
00007ff9`b6e87ad1 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21
0:796> !rwlock
No export rwlock found

C++ code snippet:

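std::mutex m_mutex;  // presumably a member of AZConnection (class declaration not shown)

Status AZConnection::Post(const Request* request, Result** pResult)
{
    // The waiting threads block here, inside MSVCP140!mtx_do_lock.
    std::lock_guard<std::mutex> sbguard(m_mutex);
    // ... rest of the function omitted in the question ...
}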

Updated:

After reading rustyx's answer, I understand that the lock does not record its owner, so I have to give up on reading the owner from the lock itself.

In fact, my program is still running, but it is out of service, and I have to find the cause. There are 806 threads and most of them are waiting in Post. Besides, because the problem may not reproduce, I cannot restart the program to add logging that prints who has acquired the lock. Hence, I just want to inspect what the thread holding the lock is doing.

The native Win32 SRWLock does not keep that information. In the uncontended state it is just a pointer-sized word used as an atomic flag.

For this reason there is no WinDbg command that can do this.

When there is contention, a wait queue is formed from the threads that are waiting. Still, no information is available about the thread that is holding the lock.

For more details about the SRWLock implementation, refer to this answer.
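
Because the lock itself stores no owner, the owner is only visible in a debugger if the program records it. The following is a minimal sketch of that idea, assuming the program can be rebuilt (which does not help with a process that is already hung). OwnerTrackingMutex and owner_tracking_self_check are hypothetical names introduced for illustration; they are not part of MSVC, the Win32 API, or the original program.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical wrapper: a std::mutex that also records the owning thread.
// It satisfies the Lockable requirements, so std::lock_guard works with it.
class OwnerTrackingMutex {
public:
    void lock() {
        m_mutex.lock();
        // Record the owner only after the lock has been acquired.
        m_owner.store(std::this_thread::get_id(), std::memory_order_relaxed);
    }

    bool try_lock() {
        if (!m_mutex.try_lock()) return false;
        m_owner.store(std::this_thread::get_id(), std::memory_order_relaxed);
        return true;
    }

    void unlock() {
        // Clear the owner before releasing, while we still hold the lock.
        m_owner.store(std::thread::id(), std::memory_order_relaxed);
        m_mutex.unlock();
    }

    // A default-constructed id means "no owner". The value is only a
    // diagnostic hint: it may be stale by the time the caller looks at it.
    std::thread::id owner() const {
        return m_owner.load(std::memory_order_relaxed);
    }

private:
    std::mutex m_mutex;
    std::atomic<std::thread::id> m_owner{std::thread::id()};
};

// Small self-check showing the owner field while the lock is held.
inline void owner_tracking_self_check() {
    OwnerTrackingMutex m;
    assert(m.owner() == std::thread::id());
    {
        std::lock_guard<OwnerTrackingMutex> guard(m);
        assert(m.owner() == std::this_thread::get_id());
    }
    assert(m.owner() == std::thread::id());
}
```

With a wrapper like this in place of std::mutex, the recorded owner is an ordinary data member that a debugger can display from any thread blocked in lock(). The extra atomic store on each lock and unlock is the price of that visibility.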

I can use !foreachframe and !if from the MEX Debugging Extension for WinDbg to grep each call stack and run a command when a frame matches (multiple commands separated by the ; command separator are not supported). The goal is to find the thread that is inside Post but is not waiting for the lock, that is, a thread whose stack contains Post while the frame called from Post is not mtx_do_lock. The extension can be downloaded here. After downloading, it can be put in C:\WinDDK\7600.16385.1\Debuggers\winext (see also Loading Debugger Extension DLLs).

As an example, in the following commands I replaced MSVCP140!mtx_do_lock with msvcrt!_threadstartex and AZSDK!AZConnection::Post with KERNEL32!BaseThreadInitThunk:

~*e r @$t0 = -1; !foreachframe -q -f 'KERNEL32!BaseThreadInitThunk' r @$t0= @#FrameNum - 1; .if(0<=@$t0) { !if -DoesNotContainRegex 'msvcrt!_threadstartex' -then '.printf /D "Thread: <link cmd=\"~~[%x]\">0x%x</link> (<link cmd=\"!mex.t %d\">%d</link>)", $tid, $tid, $dtid, $dtid' .frame @$t0 }

!foreachthread -q !foreachframe -q -f 'KERNEL32!BaseThreadInitThunk' !if -DoesNotContainRegex 'msvcrt!_threadstartex' -then '.printf /D "Thread: <link cmd=\"~~[%x]\">0x%x</link> (<link cmd=\"!mex.t %d\">%d</link>)", $tid, $tid, $dtid, $dtid' .frame @#FrameNum - 1

Example output for a notepad process:

Thread: 0xd14c (0)
Thread: 0x4f88 (1)
Thread: 0xd198 (7)

Call stacks of all threads:

0:001> !foreachthread k
Child-SP          RetAddr           Call Site
00000062`fabaf8c8 00007ffd`54c7409d win32u!ZwUserGetMessage+0x14
00000062`fabaf8d0 00007ff7`bb4c449f USER32!GetMessageW+0x2d
00000062`fabaf930 00007ff7`bb4dae07 notepad+0x449f
00000062`fabafa30 00007ffd`570f7974 notepad+0x1ae07
00000062`fabafaf0 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fabafb20 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Changing to thread: 0xda94 (1)
Child-SP          RetAddr           Call Site
00000062`fae7fbc8 00007ffd`5847f01b ntdll!DbgBreakPoint
00000062`fae7fbd0 00007ffd`570f7974 ntdll!DbgUiRemoteBreakin+0x4b
00000062`fae7fc00 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fae7fc30 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Changing to thread: 0xdb60 (7)
Child-SP          RetAddr           Call Site
00000062`fb17f968 00007ffd`54c7004d win32u!ZwUserMsgWaitForMultipleObjectsEx+0x14
00000062`fb17f970 00007ffc`e4e7d078 USER32!MsgWaitForMultipleObjectsEx+0x9d
00000062`fb17f9b0 00007ffc`e4e7cec2 DUser!GetMessageExA+0x2f8
00000062`fb17fa50 00007ffd`54c77004 DUser!GetMessageExA+0x142
00000062`fb17fab0 00007ffd`584534a4 USER32!Ordinal2582+0x64
00000062`fb17fb50 00007ffd`54101164 ntdll!KiUserCallbackDispatcher+0x24
00000062`fb17fbc8 00007ffd`54c7409d win32u!ZwUserGetMessage+0x14
00000062`fb17fbd0 00007ffd`2e4efa3c USER32!GetMessageW+0x2d
00000062`fb17fc30 00007ffd`1d0b30f8 DUI70!StartMessagePump+0x3c
00000062`fb17fc90 00007ffd`1d0b31ce msctfuimanager!DllCanUnloadNow+0xf3e8
00000062`fb17fd50 00007ffd`570f7974 msctfuimanager!DllCanUnloadNow+0xf4be
00000062`fb17fd80 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fb17fdb0 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Changing to thread: 0xc490 (8)
Child-SP          RetAddr           Call Site
00000062`fb1ff708 00007ffd`54c7004d win32u!ZwUserMsgWaitForMultipleObjectsEx+0x14
00000062`fb1ff710 00007ffc`e4e7d1ca USER32!MsgWaitForMultipleObjectsEx+0x9d
00000062`fb1ff750 00007ffc`e4e7cde7 DUser!GetMessageExA+0x44a
00000062`fb1ff7f0 00007ffc`e4e7ca53 DUser!GetMessageExA+0x67
00000062`fb1ff840 00007ffd`5505b0ea DUser!GetGadgetFocus+0x33b3
00000062`fb1ff8d0 00007ffd`5505b1bc msvcrt!_callthreadstartex+0x1e
00000062`fb1ff900 00007ffd`570f7974 msvcrt!_threadstartex+0x7c
00000062`fb1ff930 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fb1ff960 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Updated:

I found that someone else had the same idea (quoted here in full because links frequently go dead):

Slim reader/writer locks don't remember who the owners are, so you'll have to find them some other way

Raymond | August 10th, 2011

The slim reader/writer lock is a very convenient synchronization facility, but one of the downsides is that it doesn't keep track of who the current owners are. When your thread is stuck waiting to acquire a slim reader/writer lock, a natural thing to want to know is which threads own the resource your stuck thread is waiting for.

Since there's no facility for going from the waiting thread to the owning threads, you'll just have to find the owning threads some other way. Here's the thread that is waiting for the lock in shared mode:

ntdll!ZwWaitForKeyedEvent+0xc
ntdll!RtlAcquireSRWLockShared+0x126
dbquery!CSearchSpace::Validate+0x10b
dbquery!CSearchSpace::DecomposeSearchSpace+0x3c
dbquery!CQuery::AddConfigs+0xdc
dbquery!CQuery::ResolveProviders+0x89
dbquery!CResults::CreateProviders+0x85
dbquery!CResults::GetProviders+0x61
dbquery!CResults::CreateResults+0x11c

Okay, how do you find the thread that owns the lock?

First, slim reader/writer locks are usable only within a process, so the candidate threads are the ones within the process.

Second, the usage pattern for locks is nearly always something like

enter lock
do something
exit lock

It is highly unusual for a function to take a lock and exit to external code with the lock held. (It might exit to other code within the same component, transferring the obligation to exit the lock to that other code.) Therefore, you want to look for threads that are still inside dbquery.dll, possibly even still inside CSearchSpace (if the lock is a per-object lock rather than a global one).

Of course, the possibility might be that the code that entered the lock messed up and forgot to release it, but if that's the case, no amount of searching for it will find anything since the culprit is long gone. Since debugging is an exercise in optimism, we may as well proceed on the assumption that we're not in that case. If it fails to find the lock owner, then we may have to revisit the assumption.

Finally, the last trick is knowing which threads to ignore.

For now, you can also ignore the threads that are waiting for the lock, since they are the victims not the cause. (Again, if we fail to find the lock owner, we can revisit the assumption that they are not the cause; for example, they may be attempting to acquire the lock recursively.)

As it happens, there is only one thread in the process that passes all the above filters.

dbquery!CProp::Marshall+0x3b
dbquery!CRequest::CRequest+0x24c
dbquery!CQuery::Execute+0x668
dbquery!CResults::FillParams+0x1c4
dbquery!CResults::AddProvider+0x4e
dbquery!CResults::AddConfigs+0x1c5
dbquery!CResults::CreateResults+0x145

This may not be the source of the problem, but it's a good start. (Actually, it looks very promising since the problem is probably that the process on the other side of the marshaller is stuck.)
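
The filtering described in the quoted article can also be sketched in code, assuming the call stacks have already been captured as text (for example by saving the output of a stack command for every thread). ThreadStack, has_frame_with_prefix, candidate_owners and candidate_owners_self_check are hypothetical names introduced for illustration, and the prefix match on "ntdll!RtlAcquireSRWLock" is an assumption based on the frames shown above, not a general rule.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory form of one thread's call stack.
// Frames are ordered innermost first, as a debugger prints them.
struct ThreadStack {
    unsigned id;
    std::vector<std::string> frames;
};

// True if any frame of the thread starts with the given prefix.
inline bool has_frame_with_prefix(const ThreadStack& t, const std::string& prefix) {
    for (const std::string& frame : t.frames) {
        if (frame.compare(0, prefix.size(), prefix) == 0) return true;
    }
    return false;
}

// Apply the two filters from the article: keep threads that are inside the
// given module, and drop threads that are themselves waiting for an SRW lock.
inline std::vector<unsigned> candidate_owners(const std::vector<ThreadStack>& threads,
                                              const std::string& module) {
    std::vector<unsigned> result;
    for (const ThreadStack& t : threads) {
        const bool waiting = has_frame_with_prefix(t, "ntdll!RtlAcquireSRWLock");
        const bool in_module = has_frame_with_prefix(t, module + "!");
        if (in_module && !waiting) result.push_back(t.id);
    }
    return result;
}

// Self-check using abbreviated versions of the stacks quoted above.
inline void candidate_owners_self_check() {
    const std::vector<ThreadStack> threads = {
        {1, {"ntdll!ZwWaitForKeyedEvent+0xc", "ntdll!RtlAcquireSRWLockShared+0x126",
             "dbquery!CSearchSpace::Validate+0x10b"}},
        {2, {"dbquery!CProp::Marshall+0x3b", "dbquery!CRequest::CRequest+0x24c"}},
        {3, {"win32u!ZwUserGetMessage+0x14", "USER32!GetMessageW+0x2d"}},
    };
    const std::vector<unsigned> owners = candidate_owners(threads, "dbquery");
    assert(owners.size() == 1 && owners[0] == 2);
}
```

As in the article, the result is only a list of suspects. If it comes back empty, the assumptions (the owner is still inside the module and is not one of the waiters) need to be revisited.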
