
backtrace() function during fault (SIGSEGV) signal handler

Original post: 2011-08-15 19:11:41 · Tags: c, linux, stack-trace

I have read (see here) that the "common practice" for printing a stack trace with backtrace() inside a fault signal handler (e.g. when handling SIGSEGV) under Linux is to:

1. Get the instruction pointer (EIP or RIP) from the undocumented sigcontext structure.

2. Replace the 2nd frame in the stack trace with the instruction pointer, since the 1st frame is the signal handler and the 2nd frame is supposed to be inside libc's sigaction code, which has overwritten the original frame in which the fault occurred.

3. Print the backtrace starting from the newly replaced 2nd frame.
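For reference, the three steps above might look roughly like the following on Linux with glibc. This is only a sketch under assumptions: the names fault_pc, replace_second_frame, on_fault and install_fault_handler are made up for illustration, the register lookup covers x86 and x86_64 only, and the context is read through the ucontext_t pointer that an SA_SIGINFO handler receives rather than through sigcontext directly.

```c
#define _GNU_SOURCE /* exposes REG_RIP / REG_EIP in glibc's <ucontext.h> */
#include <execinfo.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Step 1: read the saved instruction pointer from the signal context.
 * Returns NULL where this sketch does not know the register layout. */
void *fault_pc(const ucontext_t *uc)
{
#if defined(__x86_64__) && defined(REG_RIP)
    return (void *)(uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__i386__) && defined(REG_EIP)
    return (void *)(uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
#else
    (void)uc;
    return NULL;
#endif
}

/* Step 2: overwrite the 2nd frame (index 1) with the faulting address.
 * Returns 1 if the frame was replaced, 0 otherwise. */
int replace_second_frame(void **frames, int n, void *pc)
{
    if (pc == NULL || n < 2)
        return 0;
    frames[1] = pc;
    return 1;
}

static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    (void)info;
    replace_second_frame(frames, n, fault_pc((const ucontext_t *)ctx));

    /* Step 3: print from the (possibly replaced) 2nd frame onward.
     * backtrace_symbols_fd() writes to the fd without calling malloc(). */
    if (n > 1)
        backtrace_symbols_fd(frames + 1, n - 1, STDERR_FILENO);

    /* Restore the default action and re-raise so the process still
     * terminates with the original signal (and can dump core). */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Install the handler for SIGSEGV; returns 0 on success, -1 on error. */
int install_fault_handler(void)
{
    void *warmup[1];
    struct sigaction sa;

    /* The first backtrace() call may load the unwinder library, so make
     * that call here instead of inside the signal handler. */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO; /* needed to receive the context pointer */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_fault_handler() once near the start of main() would arm the handler. Linking with -rdynamic usually gives backtrace_symbols_fd() more function names to print.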

In my testing (on x86_64 with a 2.6 kernel), however, the original frame in which the fault occurred is in fact present in the stack trace returned by backtrace(), as the 3rd frame: the 1st is the signal handler and the 2nd is in libc's signal handling code.

Is this change in kernel signal handling documented somewhere that you can point me to?

The upshot seems to be that you can skip replacing any frame with the instruction pointer and simply print the stack trace from backtrace() starting at frame 3, but I want confirmation that this is known behavior and the correct way to do it.

1 Answer

This is an interesting thing to try to do, but it's not really portable and probably will never be 100% reliable. So just implement it the way you say, if that works on your platform, and include a couple of small unit tests for it so that you know right away if some system you use in the future doesn't work the same way. After all, when this code is invoked, you're already screwed, so just do the best you can and move along.
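One possible shape for such a unit test, as a rough sketch: fork a child, fault on purpose, and have the handler report which backtrace() frame equals the instruction pointer saved in the signal context. Everything here is an assumption for illustration (the names saved_pc, probe_handler and fault_frame_index, and the x86-only register lookup). On the setup described in the question one would expect index 2, i.e. the 3rd frame, but confirming that is the whole point of the check.

```c
#define _GNU_SOURCE /* exposes REG_RIP / REG_EIP in glibc's <ucontext.h> */
#include <execinfo.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <ucontext.h>
#include <unistd.h>

#define MAX_FRAMES 64

static int report_fd = -1; /* write end of the pipe, set in the child */

/* Saved instruction pointer, or NULL on architectures not covered here. */
static void *saved_pc(const ucontext_t *uc)
{
#if defined(__x86_64__) && defined(REG_RIP)
    return (void *)(uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__i386__) && defined(REG_EIP)
    return (void *)(uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
#else
    (void)uc;
    return NULL;
#endif
}

/* Child-side handler: find the frame matching the saved PC and report it. */
static void probe_handler(int sig, siginfo_t *info, void *ctx)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);
    void *pc = saved_pc((const ucontext_t *)ctx);
    int idx = -1;
    int i;

    (void)sig;
    (void)info;
    for (i = 0; pc != NULL && i < n; i++) {
        if (frames[i] == pc) {
            idx = i;
            break;
        }
    }
    if (write(report_fd, &idx, sizeof idx) != (ssize_t)sizeof idx)
        _exit(2);
    _exit(0);
}

/* Returns the 0-based index of the faulting frame as seen by backtrace()
 * in a SIGSEGV handler, -1 if it was not found (or the architecture is
 * not covered), or -2 if the probe itself could not be run. */
int fault_frame_index(void)
{
    int fds[2];
    int idx = -2;
    int status;
    void *warmup[1];
    pid_t pid;

    backtrace(warmup, 1); /* load the unwinder before forking */
    if (pipe(fds) != 0)
        return -2;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -2;
    }
    if (pid == 0) {
        struct sigaction sa;
        int *volatile bad = NULL; /* volatile: keep the compiler from folding the store */

        close(fds[0]);
        report_fd = fds[1];
        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = probe_handler;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGSEGV, &sa, NULL) != 0)
            _exit(3);
        *bad = 1; /* deliberate fault */
        _exit(4); /* only reached if no fault occurred */
    }
    close(fds[1]);
    if (read(fds[0], &idx, sizeof idx) != (ssize_t)sizeof idx)
        idx = -2;
    close(fds[0]);
    waitpid(pid, &status, 0);
    return idx;
}
```

A test could then compare fault_frame_index() with the frame position the production handler assumes, and fail loudly on any platform where the two differ.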

A totally different alternative, which you can use either alongside or instead of your scheme, is to write a script that Linux invokes when a program dumps core. This script can then run gdb in batch mode on the core file to get the backtrace and send you an email or whatever.