
Find who holds a SRW Lock for a native process with WinDbg

I have a program written in C++ and I am having trouble finding which thread has acquired a Slim Reader/Writer (SRW) lock. I googled and found "Determining which method is holding a ReaderWriterLockSlim WriteLock", but that is about a program written in C#. Besides, some commands, for example !rwlock, are unavailable.

0:796> !handle 0 ff Mutant
Handle c
  Type          Mutant
  Attributes    0
  GrantedAccess 0x1f0001:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         QueryState
  HandleCount   4
  PointerCount  103240
  Name          \BaseNamedObjects\DBWinMutex
  Object Specific Information
    Mutex is Free
Handle 474
  Type          Mutant
  Attributes    0
  GrantedAccess 0x1f0001:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         QueryState
  HandleCount   2
  PointerCount  65536
  Name          \BaseNamedObjects\SM0:928:304:WilStaging_02
  Object Specific Information
    Mutex is Free
2 handles of type Mutant
0:796> kb
RetAddr           : Args to Child                                                           : Call Site
00007ff9`b6e3d33a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!ZwWaitForAlertByThreadId+0x14
00007ff9`a85726a9 : 00000000`00000000 00000000`00000000 00000192`83338180 00000000`00000000 : ntdll!RtlAcquireSRWLockExclusive+0x13a
00007ff9`a6231724 : c000000d`00000000 00000000`00000000 00000192`83338180 00000000`00000002 : MSVCP140!mtx_do_lock+0x7d [d:\agent\_work\2\s\src\vctools\crt\crtw32\stdcpp\thr\mutex.cpp @ 106]
00007ff9`a626749e : 00000192`f6a26e38 00000193`4aaa3d80 00000052`897fea60 00000000`00000000 : AZSDK!AZConnection::Post+0x54 [g:\prod\sdk\src\connection.cpp @ 1147]
...
00007ff9`9c8ba9c1 : 00000192`c3b3d770 00000000`00000000 00000192`f5d616b0 00000000`00000000 : prod!Task::Execute+0x28 [g:\prod\src\task.cpp @ 51]
00007ff9`b6e97529 : 00000193`491b9830 00000000`7ffe0386 00000052`897ff998 00000193`491b98f8 : prod!Proxy::TaskExecuter+0x11 [g:\prod\src\proxy.cpp @ 2042]
00007ff9`b6e3bec4 : 00000000`00000000 00000192`f1dd03a0 00000000`00000000 00000000`00000000 : ntdll!TppSimplepExecuteCallback+0x99
00007ff9`b6c47e94 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!TppWorkerThread+0x644
00007ff9`b6e87ad1 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21
0:796> !rwlock
No export rwlock found

C++ code snippet:

std::mutex m_mutex;

Status AZConnection::Post(const Request* request, Result** pResult)
{
    std::lock_guard<std::mutex> sbguard(m_mutex);
}

Updated:

Based on rustyx's answer, I understand now. I have to give up.

In fact, my program is still running but it's out of service. I have to find the cause. I found there are 806 threads, and most of them are waiting in Post. Besides, in case the problem doesn't reproduce, I cannot restart the program to add logging that prints who has acquired the lock. Hence, I just want to inspect what the thread holding the lock is doing.

The Win32 native SRWLock does not keep that information. In an uncontended state it's just an atomic flag.

For this reason there is no WinDbg command that can do this.

When there is contention, a wait queue is formed from the threads that are waiting. Still, no information is available about the thread that is holding the lock.

For more details about the SRWLock implementation, refer to this answer.
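Since the lock itself records no owner, one option for future builds (it does not help with a process that is already hung) is to record the owner yourself. Below is a minimal sketch, assuming you can swap the std::mutex member for a wrapper. OwnerTrackingMutex is a hypothetical name, not part of the standard library or Win32.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical wrapper (not an STL or Win32 type): a mutex that also records
// which thread currently holds it, so a debugger can read the owner later.
class OwnerTrackingMutex {
public:
    void lock() {
        m_mutex.lock();
        // Only the thread that holds the lock writes the owner field.
        m_owner.store(std::this_thread::get_id(), std::memory_order_relaxed);
    }

    bool try_lock() {
        if (!m_mutex.try_lock()) return false;
        m_owner.store(std::this_thread::get_id(), std::memory_order_relaxed);
        return true;
    }

    void unlock() {
        // Unlocking from a thread that is not the owner is a bug.
        assert(m_owner.load(std::memory_order_relaxed) == std::this_thread::get_id());
        m_owner.store(std::thread::id{}, std::memory_order_relaxed);
        m_mutex.unlock();
    }

    // A default-constructed id means "not held".
    std::thread::id owner() const {
        return m_owner.load(std::memory_order_relaxed);
    }

private:
    std::mutex m_mutex;
    std::atomic<std::thread::id> m_owner{std::thread::id{}};
};
```

The class satisfies the Lockable requirements, so std::lock_guard<OwnerTrackingMutex> works unchanged in a function such as Post. In a later hang, inspecting the m_owner member of the mutex object should reveal the owning thread. How the stored id maps to the thread IDs WinDbg shows depends on the standard library implementation, so verify that on your toolchain.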

I can use !foreachframe and !if from the MEX Debugging Extension for WinDbg to grep the call stack and execute a command (multiple commands separated by ; (the command separator) are not supported) to find a thread that is not waiting for the lock but whose previous call is Post. The extension can be downloaded from here. After downloading, it can be put in C:\WinDDK\7600.16385.1\Debuggers\winext (see also Loading Debugger Extension DLLs).

In the following commands, as an example, I replaced MSVCP140!mtx_do_lock with msvcrt!_threadstartex and replaced AZSDK!AZConnection::Post with KERNEL32!BaseThreadInitThunk:

~*e r @$t0 = -1; !foreachframe -q -f 'KERNEL32!BaseThreadInitThunk' r @$t0= @#FrameNum - 1; .if(0<=@$t0) { !if -DoesNotContainRegex 'msvcrt!_threadstartex' -then '.printf /D "Thread: <link cmd=\"~~[%x]\">0x%x</link> (<link cmd=\"!mex.t %d\">%d</link>)", $tid, $tid, $dtid, $dtid' .frame @$t0 }

!foreachthread -q !foreachframe -q -f 'KERNEL32!BaseThreadInitThunk' !if -DoesNotContainRegex 'msvcrt!_threadstartex' -then '.printf /D "Thread: <link cmd=\"~~[%x]\">0x%x</link> (<link cmd=\"!mex.t %d\">%d</link>)", $tid, $tid, $dtid, $dtid' .frame @#FrameNum - 1

Example output for a notepad process:

Thread: 0xd14c (0)
Thread: 0x4f88 (1)
Thread: 0xd198 (7)
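The filter those commands apply can also be expressed outside the debugger, for example over call stacks saved to a text file. The following is a simplified sketch: it keeps threads whose stack contains the guarded function but no lock-wait frame anywhere, whereas the MEX commands above only check the frame directly next to it. ThreadStack and findCandidateOwners are hypothetical names, and parsing the debugger output into this structure is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory form of one thread's call stack, e.g. parsed from
// saved debugger output. Frames look like "module!symbol+offset".
struct ThreadStack {
    unsigned threadId;
    std::vector<std::string> frames;
};

// True if any frame of the thread contains the given substring.
static bool containsFrame(const ThreadStack& t, const std::string& needle) {
    for (const std::string& frame : t.frames) {
        if (frame.find(needle) != std::string::npos) return true;
    }
    return false;
}

// Returns the ids of threads that are inside guardedFrame but are not
// blocked in waitFrame: the candidates for "who holds the lock".
std::vector<unsigned> findCandidateOwners(const std::vector<ThreadStack>& threads,
                                          const std::string& guardedFrame,
                                          const std::string& waitFrame) {
    // An empty pattern would match every frame and make the filter useless.
    assert(!guardedFrame.empty() && !waitFrame.empty());
    std::vector<unsigned> result;
    for (const ThreadStack& t : threads) {
        if (containsFrame(t, guardedFrame) && !containsFrame(t, waitFrame)) {
            result.push_back(t.threadId);
        }
    }
    return result;
}
```

For the original problem, guardedFrame would be "AZSDK!AZConnection::Post" and waitFrame would be "MSVCP140!mtx_do_lock". Any thread returned is only a candidate and still needs manual inspection.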

Call stacks of all threads:

0:001> !foreachthread k
Child-SP          RetAddr           Call Site
00000062`fabaf8c8 00007ffd`54c7409d win32u!ZwUserGetMessage+0x14
00000062`fabaf8d0 00007ff7`bb4c449f USER32!GetMessageW+0x2d
00000062`fabaf930 00007ff7`bb4dae07 notepad+0x449f
00000062`fabafa30 00007ffd`570f7974 notepad+0x1ae07
00000062`fabafaf0 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fabafb20 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Changing to thread: 0xda94 (1)
Child-SP          RetAddr           Call Site
00000062`fae7fbc8 00007ffd`5847f01b ntdll!DbgBreakPoint
00000062`fae7fbd0 00007ffd`570f7974 ntdll!DbgUiRemoteBreakin+0x4b
00000062`fae7fc00 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fae7fc30 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Changing to thread: 0xdb60 (7)
Child-SP          RetAddr           Call Site
00000062`fb17f968 00007ffd`54c7004d win32u!ZwUserMsgWaitForMultipleObjectsEx+0x14
00000062`fb17f970 00007ffc`e4e7d078 USER32!MsgWaitForMultipleObjectsEx+0x9d
00000062`fb17f9b0 00007ffc`e4e7cec2 DUser!GetMessageExA+0x2f8
00000062`fb17fa50 00007ffd`54c77004 DUser!GetMessageExA+0x142
00000062`fb17fab0 00007ffd`584534a4 USER32!Ordinal2582+0x64
00000062`fb17fb50 00007ffd`54101164 ntdll!KiUserCallbackDispatcher+0x24
00000062`fb17fbc8 00007ffd`54c7409d win32u!ZwUserGetMessage+0x14
00000062`fb17fbd0 00007ffd`2e4efa3c USER32!GetMessageW+0x2d
00000062`fb17fc30 00007ffd`1d0b30f8 DUI70!StartMessagePump+0x3c
00000062`fb17fc90 00007ffd`1d0b31ce msctfuimanager!DllCanUnloadNow+0xf3e8
00000062`fb17fd50 00007ffd`570f7974 msctfuimanager!DllCanUnloadNow+0xf4be
00000062`fb17fd80 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fb17fdb0 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Changing to thread: 0xc490 (8)
Child-SP          RetAddr           Call Site
00000062`fb1ff708 00007ffd`54c7004d win32u!ZwUserMsgWaitForMultipleObjectsEx+0x14
00000062`fb1ff710 00007ffc`e4e7d1ca USER32!MsgWaitForMultipleObjectsEx+0x9d
00000062`fb1ff750 00007ffc`e4e7cde7 DUser!GetMessageExA+0x44a
00000062`fb1ff7f0 00007ffc`e4e7ca53 DUser!GetMessageExA+0x67
00000062`fb1ff840 00007ffd`5505b0ea DUser!GetGadgetFocus+0x33b3
00000062`fb1ff8d0 00007ffd`5505b1bc msvcrt!_callthreadstartex+0x1e
00000062`fb1ff900 00007ffd`570f7974 msvcrt!_threadstartex+0x7c
00000062`fb1ff930 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fb1ff960 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Updated:

I found that someone else thought the same way (quoted here in case the link goes dead, which happens frequently):

Slim reader/writer locks don't remember who the owners are, so you'll have to find them some other way

Raymond | August 10th, 2011

The slim reader/writer lock is a very convenient synchronization facility, but one of the downsides is that it doesn't keep track of who the current owners are.
When your thread is stuck waiting to acquire a slim reader/writer lock, a natural thing to want to know is which threads own the resource your stuck thread is waiting for.

Since there's no facility for going from the waiting thread to the owning threads, you'll just have to find the owning threads some other way. Here's the thread that is waiting for the lock in shared mode:

 ntdll!ZwWaitForKeyedEvent+0xc
 ntdll!RtlAcquireSRWLockShared+0x126
 dbquery!CSearchSpace::Validate+0x10b
 dbquery!CSearchSpace::DecomposeSearchSpace+0x3c
 dbquery!CQuery::AddConfigs+0xdc
 dbquery!CQuery::ResolveProviders+0x89
 dbquery!CResults::CreateProviders+0x85
 dbquery!CResults::GetProviders+0x61
 dbquery!CResults::CreateResults+0x11c

Okay, how do you find the thread that owns the lock?

First, slim reader/writer locks are usable only within a process, so the candidate threads are the ones within the process.

Second, the usage pattern for locks is nearly always something like

 enter lock
 do something
 exit lock

It is highly unusual for a function to take a lock and exit to external code with the lock held. (It might exit to other code within the same component, transferring the obligation to exit the lock to that other code.)
Therefore, you want to look for threads that are still inside dbquery.dll, possibly even still inside CSearchSpace (if the lock is a per-object lock rather than a global one).

Of course, the possibility might be that the code that entered the lock messed up and forgot to release it, but if that's the case, no amount of searching for it will find anything since the culprit is long gone.
Since debugging is an exercise in optimism, we may as well proceed on the assumption that we're not in that case. If it fails to find the lock owner, then we may have to revisit the assumption.

Finally, the last trick is knowing which threads to ignore.

For now, you can also ignore the threads that are waiting for the lock, since they are the victims, not the cause. (Again, if we fail to find the lock owner, we can revisit the assumption that they are not the cause; for example, they may be attempting to acquire the lock recursively.)

As it happens, there is only one thread in the process that passes all the above filters.

 dbquery!CProp::Marshall+0x3b
 dbquery!CRequest::CRequest+0x24c
 dbquery!CQuery::Execute+0x668
 dbquery!CResults::FillParams+0x1c4
 dbquery!CResults::AddProvider+0x4e
 dbquery!CResults::AddConfigs+0x1c5
 dbquery!CResults::CreateResults+0x145

This may not be the source of the problem, but it's a good start. (Actually, it looks very promising since the problem is probably that the process on the other side of the marshaller is stuck.)
