Getting stack traces on Unix systems, automatically

What methods are there for automatically getting a stack trace on Unix systems? I don't mean just getting a core file or attaching interactively with GDB, but having a SIGSEGV handler that dumps a backtrace to a text file.

Bonus points for the following optional features:

  • Extra information gathering at crash time (e.g. config files).
  • Email a crash info bundle to the developers.
  • Ability to add this in a dlopen-ed shared library.
  • Not requiring a GUI.

FYI, the suggested solution (using backtrace_symbols in a signal handler) is dangerously broken. DO NOT USE IT.

Yes, backtrace and backtrace_symbols will produce a backtrace and translate it to symbolic names, however:

  1. backtrace_symbols allocates memory using malloc, and you use free to release it. If you're crashing because of memory corruption, your malloc arena is very likely to be corrupt and cause a double fault.

  2. malloc and free protect the malloc arena with a lock internally. You might have faulted in the middle of a malloc/free with the lock taken, which will cause these functions, or anything that calls them, to deadlock.

  3. You use puts, which uses the standard stream, which is also protected by a lock. If you faulted in the middle of a printf, you once again have a deadlock.

  4. On 32-bit platforms (e.g. your normal PC of 2 years ago), the kernel will plant a return address to an internal glibc function, instead of your faulting function, in your stack. So the single most important piece of information you are interested in, namely in which function the program faulted, will actually be corrupted on those platforms.

So, the code in the example is the worst kind of wrong: it LOOKS like it's working, but it will fail you in unexpected ways in production.

BTW, interested in doing it right? Check this out.

Cheers, Gilad.

If you are on a system with the BSD backtrace functionality available (Linux, Mac OS X 10.5, BSD of course), you can do this programmatically in your signal handler.

For example (backtrace code derived from an IBM example):

#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

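void sig_handler(int sig)
{
    /* Capture up to 25 return addresses from the current call stack. */
    void * array[25];
    int nSize = backtrace(array, 25);
    /* Resolve the addresses to strings; the returned array is malloc-ed. */
    char ** symbols = backtrace_symbols(array, nSize);

    for (int i = 0; symbols != NULL && i < nSize; i++)
    {
        puts(symbols[i]);
    }

    free(symbols);

    /* Re-install the handler, since signal() may reset it to the default. */
    signal(sig, &sig_handler);
}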

void h()
{
    kill(0, SIGSEGV);
}

void g()
{
    h();
}

void f()
{
    g();
}

int main(int argc, char ** argv)
{
    signal(SIGSEGV, &sig_handler);
    f();
}

Output:

0   a.out                               0x00001f2d sig_handler + 35
1   libSystem.B.dylib                   0x95f8f09b _sigtramp + 43
2   ???                                 0xffffffff 0x0 + 4294967295
3   a.out                               0x00001fb1 h + 26
4   a.out                               0x00001fbe g + 11
5   a.out                               0x00001fcb f + 11
6   a.out                               0x00001ff5 main + 40
7   a.out                               0x00001ede start + 54

This doesn't get bonus points for the optional features (except not requiring a GUI). However, it does have the advantage of being very simple and not requiring any additional libraries or programs.

Here is an example of how to get some more info using a demangler. As you can see, this one also logs the stack trace to a file.

#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
#include <cstdlib>
#include <execinfo.h>
#include <signal.h>
#include <cxxabi.h>

void sig_handler(int sig)
{
    std::stringstream stream;
    void * array[25];
    int nSize = backtrace(array, 25);
    char ** symbols = backtrace_symbols(array, nSize);
    for (int i = 0; i < nSize; i++) {
        int status;
        char *realname;
        std::string current = symbols[i];
        size_t start = current.find("(");
        size_t end = current.find("+");
        realname = NULL;
        if (start != std::string::npos && end != std::string::npos) {
            std::string symbol = current.substr(start+1, end-start-1);
            realname = abi::__cxa_demangle(symbol.c_str(), 0, 0, &status);
        }
        if (realname != NULL)
            stream << realname << std::endl;
        else
            stream << symbols[i] << std::endl;
        free(realname);
    }
    free(symbols);
    std::cerr << stream.str();
    std::ofstream file("/tmp/error.log");
    if (file.is_open()) {
        if (file.good())
            file << stream.str();
        file.close();
    }
    signal(sig, &sig_handler);
}

Derek's solution is probably the best, but here's an alternative anyway:

Recent Linux kernel versions allow you to pipe core dumps to a script or program. You could write a script to catch the core dump, collect any extra information you need and mail everything back. This is a global setting though, so it would apply to any crashing program on the system. It also requires root rights to set up. It can be configured through the /proc/sys/kernel/core_pattern file. Set that to something like '|/home/myuser/bin/my-core-handler-script'.
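If you'd rather write the handler in C than as a script, the core of it could look roughly like the sketch below. The kernel feeds the core image to the handler on standard input, so the main job is copying that stream somewhere safe before gathering anything else. The function name save_core and the output path are made up for illustration, and retrying on EINTR is left out for brevity.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy everything readable from in_fd into a new file at out_path.
   Returns the number of bytes copied, or -1 on error. */
long save_core(int in_fd, const char *out_path)
{
    char buf[65536];
    long total = 0;
    ssize_t n;
    int out = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (out < 0)
        return -1;

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        /* write() may be partial, so loop until the chunk is flushed. */
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(out);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(out);
    return n < 0 ? -1 : total;
}
```

A real handler's main would call save_core(STDIN_FILENO, path), with the path built from arguments such as %p (the PID) and %e (the executable name) appended to the core_pattern line. Collecting config files and mailing the bundle would come after that.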

The Ubuntu people use this feature as well.
