Find who holds an SRW lock in a native process with WinDbg
I have a program written in C++, and I am having trouble finding which thread has acquired a Slim Reader/Writer (SRW) lock. I searched and found "Determining which method is holding a ReaderWriterLockSlim WriteLock", but that question is about a program written in C#. Besides, some commands, for example !rwlock, are unavailable.
0:796> !handle 0 ff Mutant
Handle c
Type Mutant
Attributes 0
GrantedAccess 0x1f0001:
Delete,ReadControl,WriteDac,WriteOwner,Synch
QueryState
HandleCount 4
PointerCount 103240
Name \BaseNamedObjects\DBWinMutex
Object Specific Information
Mutex is Free
Handle 474
Type Mutant
Attributes 0
GrantedAccess 0x1f0001:
Delete,ReadControl,WriteDac,WriteOwner,Synch
QueryState
HandleCount 2
PointerCount 65536
Name \BaseNamedObjects\SM0:928:304:WilStaging_02
Object Specific Information
Mutex is Free
2 handles of type Mutant
0:796> kb
RetAddr : Args to Child : Call Site
00007ff9`b6e3d33a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!ZwWaitForAlertByThreadId+0x14
00007ff9`a85726a9 : 00000000`00000000 00000000`00000000 00000192`83338180 00000000`00000000 : ntdll!RtlAcquireSRWLockExclusive+0x13a
00007ff9`a6231724 : c000000d`00000000 00000000`00000000 00000192`83338180 00000000`00000002 : MSVCP140!mtx_do_lock+0x7d [d:\agent\_work\2\s\src\vctools\crt\crtw32\stdcpp\thr\mutex.cpp @ 106]
00007ff9`a626749e : 00000192`f6a26e38 00000193`4aaa3d80 00000052`897fea60 00000000`00000000 : AZSDK!AZConnection::Post+0x54 [g:\prod\sdk\src\connection.cpp @ 1147]
...
00007ff9`9c8ba9c1 : 00000192`c3b3d770 00000000`00000000 00000192`f5d616b0 00000000`00000000 : prod!Task::Execute+0x28 [g:\prod\src\task.cpp @ 51]
00007ff9`b6e97529 : 00000193`491b9830 00000000`7ffe0386 00000052`897ff998 00000193`491b98f8 : prod!Proxy::TaskExecuter+0x11 [g:\prod\src\proxy.cpp @ 2042]
00007ff9`b6e3bec4 : 00000000`00000000 00000192`f1dd03a0 00000000`00000000 00000000`00000000 : ntdll!TppSimplepExecuteCallback+0x99
00007ff9`b6c47e94 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!TppWorkerThread+0x644
00007ff9`b6e87ad1 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21
0:796> !rwlock
No export rwlock found
C++ code snippet:
std::mutex m_mutex;
Status AZConnection::Post(const Request* request, Result** pResult)
{
    std::lock_guard<std::mutex> sbguard(m_mutex);
}
Updated:
After reading rustyx's answer, I see. Now I have to give up.
In fact, my program is still running, but it is out of service, and I have to find the cause. There are 806 threads, and most of them are waiting in Post. Besides, since the problem may not reproduce, I cannot restart the program to add logging that prints who has acquired the lock. Hence, I just want to inspect what the thread holding the lock is doing.
The Win32 native SRWLock does not keep that information. In an uncontended state it's just an atomic flag.
For this reason there is no WinDbg command that can do this.
When there is contention, a wait queue is formed from the threads that are waiting. Still, no information is available about the thread that is holding the lock.
For more details about the SRWLock implementation, refer to this answer.
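Since the lock itself does not record an owner, a future build could record one alongside it. The following is only a minimal sketch under that assumption: OwnerTrackingMutex and its m_owner field are hypothetical names, not part of the standard library or of the program in the question. It does not help with a process that is already hung.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical wrapper: a std::mutex that also remembers which thread holds it.
class OwnerTrackingMutex {
public:
    void lock() {
        m_mutex.lock();
        // Record the owner only after the lock has actually been acquired.
        m_owner.store(std::this_thread::get_id(), std::memory_order_relaxed);
    }

    bool try_lock() {
        if (!m_mutex.try_lock()) {
            return false;
        }
        m_owner.store(std::this_thread::get_id(), std::memory_order_relaxed);
        return true;
    }

    void unlock() {
        // Debug-build check: only the recorded owner may unlock.
        assert(m_owner.load(std::memory_order_relaxed) == std::this_thread::get_id());
        // Clear the owner before releasing, so a new owner is never overwritten.
        m_owner.store(std::thread::id{}, std::memory_order_relaxed);
        m_mutex.unlock();
    }

    // Diagnostic only: the value may be stale by the time the caller reads it.
    std::thread::id owner() const {
        return m_owner.load(std::memory_order_relaxed);
    }

private:
    std::mutex m_mutex;
    std::atomic<std::thread::id> m_owner{std::thread::id{}};
};
```

Because the class provides lock() and unlock(), std::lock_guard<OwnerTrackingMutex> can be used in place of std::lock_guard<std::mutex>, and a debugger can read the m_owner field from the mutex object while other threads are blocked.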
I can use !foreachframe and !if from the MEX Debugging Extension for WinDbg to grep the call stacks and execute a command (multiple commands separated by ; (the command separator) are not supported) to find a thread that has Post on its stack but is not waiting for the lock. The extension can be downloaded from here. After downloading, it can be put in C:\WinDDK\7600.16385.1\Debuggers\winext (see also Loading Debugger Extension DLLs).
As an example, in the following commands I replaced MSVCP140!mtx_do_lock with msvcrt!_threadstartex and AZSDK!AZConnection::Post with KERNEL32!BaseThreadInitThunk:
~*e r @$t0 = -1; !foreachframe -q -f 'KERNEL32!BaseThreadInitThunk' r @$t0= @#FrameNum - 1; .if(0<=@$t0) { !if -DoesNotContainRegex 'msvcrt!_threadstartex' -then '.printf /D "Thread: <link cmd=\"~~[%x]\">0x%x</link> (<link cmd=\"!mex.t %d\">%d</link>)", $tid, $tid, $dtid, $dtid' .frame @$t0 }
!foreachthread -q !foreachframe -q -f 'KERNEL32!BaseThreadInitThunk' !if -DoesNotContainRegex 'msvcrt!_threadstartex' -then '.printf /D "Thread: <link cmd=\"~~[%x]\">0x%x</link> (<link cmd=\"!mex.t %d\">%d</link>)", $tid, $tid, $dtid, $dtid' .frame @#FrameNum - 1
Example output for a notepad process:
Thread: 0xd14c (0)
Thread: 0x4f88 (1)
Thread: 0xd198 (7)
Call stacks of all threads:
0:001> !foreachthread k
Child-SP RetAddr Call Site
00000062`fabaf8c8 00007ffd`54c7409d win32u!ZwUserGetMessage+0x14
00000062`fabaf8d0 00007ff7`bb4c449f USER32!GetMessageW+0x2d
00000062`fabaf930 00007ff7`bb4dae07 notepad+0x449f
00000062`fabafa30 00007ffd`570f7974 notepad+0x1ae07
00000062`fabafaf0 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fabafb20 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Changing to thread: 0xda94 (1)
Child-SP RetAddr Call Site
00000062`fae7fbc8 00007ffd`5847f01b ntdll!DbgBreakPoint
00000062`fae7fbd0 00007ffd`570f7974 ntdll!DbgUiRemoteBreakin+0x4b
00000062`fae7fc00 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fae7fc30 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Changing to thread: 0xdb60 (7)
Child-SP RetAddr Call Site
00000062`fb17f968 00007ffd`54c7004d win32u!ZwUserMsgWaitForMultipleObjectsEx+0x14
00000062`fb17f970 00007ffc`e4e7d078 USER32!MsgWaitForMultipleObjectsEx+0x9d
00000062`fb17f9b0 00007ffc`e4e7cec2 DUser!GetMessageExA+0x2f8
00000062`fb17fa50 00007ffd`54c77004 DUser!GetMessageExA+0x142
00000062`fb17fab0 00007ffd`584534a4 USER32!Ordinal2582+0x64
00000062`fb17fb50 00007ffd`54101164 ntdll!KiUserCallbackDispatcher+0x24
00000062`fb17fbc8 00007ffd`54c7409d win32u!ZwUserGetMessage+0x14
00000062`fb17fbd0 00007ffd`2e4efa3c USER32!GetMessageW+0x2d
00000062`fb17fc30 00007ffd`1d0b30f8 DUI70!StartMessagePump+0x3c
00000062`fb17fc90 00007ffd`1d0b31ce msctfuimanager!DllCanUnloadNow+0xf3e8
00000062`fb17fd50 00007ffd`570f7974 msctfuimanager!DllCanUnloadNow+0xf4be
00000062`fb17fd80 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fb17fdb0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Changing to thread: 0xc490 (8)
Child-SP RetAddr Call Site
00000062`fb1ff708 00007ffd`54c7004d win32u!ZwUserMsgWaitForMultipleObjectsEx+0x14
00000062`fb1ff710 00007ffc`e4e7d1ca USER32!MsgWaitForMultipleObjectsEx+0x9d
00000062`fb1ff750 00007ffc`e4e7cde7 DUser!GetMessageExA+0x44a
00000062`fb1ff7f0 00007ffc`e4e7ca53 DUser!GetMessageExA+0x67
00000062`fb1ff840 00007ffd`5505b0ea DUser!GetGadgetFocus+0x33b3
00000062`fb1ff8d0 00007ffd`5505b1bc msvcrt!_callthreadstartex+0x1e
00000062`fb1ff900 00007ffd`570f7974 msvcrt!_threadstartex+0x7c
00000062`fb1ff930 00007ffd`5841a261 KERNEL32!BaseThreadInitThunk+0x14
00000062`fb1ff960 00000000`00000000 ntdll!RtlUserThreadStart+0x21
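The filter that the MEX commands apply can also be sketched outside the debugger, for instance over call stacks saved to a text file. This is only an illustration of the logic, not a replacement for the extension: ThreadStack and findLockOwnerCandidates are hypothetical names, and the frames are assumed to be already parsed into "module!symbol" strings without offsets, innermost frame first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One thread's call stack, innermost frame first (the order `k` prints).
struct ThreadStack {
    int debuggerThreadNumber;          // e.g. the 7 in "Changing to thread: 0xdb60 (7)"
    std::vector<std::string> frames;   // "module!symbol", offsets stripped
};

// Returns the threads that have `anchorFrame` on their stack but whose callee
// of that frame is not `waitFrame`, i.e. threads that entered the function of
// interest and are doing something other than waiting for the lock.
std::vector<int> findLockOwnerCandidates(const std::vector<ThreadStack>& threads,
                                         const std::string& anchorFrame,
                                         const std::string& waitFrame) {
    std::vector<int> candidates;
    for (const ThreadStack& thread : threads) {
        for (std::size_t i = 0; i < thread.frames.size(); ++i) {
            if (thread.frames[i] != anchorFrame) {
                continue;
            }
            // frames[i - 1] is the function that the anchor frame called.
            const bool waiting = i > 0 && thread.frames[i - 1] == waitFrame;
            if (!waiting) {
                candidates.push_back(thread.debuggerThreadNumber);
            }
            break;  // only the innermost occurrence of the anchor is examined
        }
    }
    return candidates;
}
```

For the original problem, the anchor would be AZSDK!AZConnection::Post and the wait frame MSVCP140!mtx_do_lock. One design choice here differs from the first MEX command: a thread whose innermost frame is the anchor itself is reported as a candidate, since it is running inside the function rather than waiting.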
Updated:
I found that someone else had the same idea (quoted in full below, because links often go dead):
Slim reader/writer locks don't remember who the owners are, so you'll have to find them some other way
Raymond | August 10th, 2011
The slim reader/writer lock is a very convenient synchronization facility, but one of the downsides is that it doesn't keep track of who the current owners are.
When your thread is stuck waiting to acquire a slim reader/writer lock, a natural thing to want to know is which threads own the resource your stuck thread is waiting for.
Since there's no facility for going from the waiting thread to the owning threads, you'll just have to find the owning threads some other way. Here's the thread that is waiting for the lock in shared mode:
ntdll!ZwWaitForKeyedEvent+0xc
ntdll!RtlAcquireSRWLockShared+0x126
dbquery!CSearchSpace::Validate+0x10b
dbquery!CSearchSpace::DecomposeSearchSpace+0x3c
dbquery!CQuery::AddConfigs+0xdc
dbquery!CQuery::ResolveProviders+0x89
dbquery!CResults::CreateProviders+0x85
dbquery!CResults::GetProviders+0x61
dbquery!CResults::CreateResults+0x11c
Okay, how do you find the thread that owns the lock?
First, slim reader/writer locks are usable only within a process, so the candidate threads are the ones within the process.
Second, the usage pattern for locks is nearly always something like
enter lock
do something
exit lock
It is highly unusual for a function to take a lock and exit to external code with the lock held. (It might exit to other code within the same component, transferring the obligation to exit the lock to that other code.)
Therefore, you want to look for threads that are still inside dbquery.dll, possibly even still inside CSearchSpace (if the lock is a per-object lock rather than a global one).
Of course, the possibility might be that the code that entered the lock messed up and forgot to release it, but if that's the case, no amount of searching for it will find anything since the culprit is long gone.
Since debugging is an exercise in optimism, we may as well proceed on the assumption that we're not in that case. If it fails to find the lock owner, then we may have to revisit the assumption.
Finally, the last trick is knowing which threads to ignore.
For now, you can also ignore the threads that are waiting for the lock, since they are the victims, not the cause. (Again, if we fail to find the lock owner, we can revisit the assumption that they are not the cause; for example, they may be attempting to acquire the lock recursively.)
As it happens, there is only one thread in the process that passes all the above filters.
dbquery!CProp::Marshall+0x3b
dbquery!CRequest::CRequest+0x24c
dbquery!CQuery::Execute+0x668
dbquery!CResults::FillParams+0x1c4
dbquery!CResults::AddProvider+0x4e
dbquery!CResults::AddConfigs+0x1c5
dbquery!CResults::CreateResults+0x145
This may not be the source of the problem, but it's a good start.
(Actually, it looks very promising since the problem is probably that the process on the other side of the marshaller is stuck.)