
How are segmentation faults reported?

I was just wondering how segmentation faults might get reported.

  • The process will just die, so obviously it cannot report it.
  • The shell would not know for sure unless the process passes a signal, which might not necessarily be the case.
  • The OS might be able to do something, but I am not sure how.

Which one of these reports segmentation faults (just an example), and how?

The process will just die, so obviously it cannot report it.

This is actually false. It is possible to install a SIGSEGV handler to replace the default one, which simply dumps core and dies. A preload library can do so to catch a segmentation violation and use the limited facilities available to notify another process running on the system of what has happened before exiting.
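As a minimal sketch of that idea, assuming a POSIX system: the handler below replaces the default SIGSEGV action, reports with async-signal-safe calls only, and then exits. The names ReportSegv, InstallSegvReporter and RunFaultingChild, the message text and the exit code 42 are all made up for illustration; raise(SIGSEGV) stands in for a real invalid memory access.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical handler: reports the fault using only async-signal-safe
// calls (write, _exit), then exits with a code the parent can recognise.
extern "C" void ReportSegv(int) {
  static const char msg[] = "caught SIGSEGV, exiting\n";
  ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
  (void)ignored;
  _exit(42);  // arbitrary illustrative exit code
}

// Installs ReportSegv in place of the default SIGSEGV action.
// Returns true on success.
bool InstallSegvReporter() {
  struct sigaction action = {};
  action.sa_handler = ReportSegv;
  sigemptyset(&action.sa_mask);
  return sigaction(SIGSEGV, &action, nullptr) == 0;
}

// Demonstration helper: forks a child that installs the handler and then
// sends itself SIGSEGV. Returns the child's exit code, or -1 on error.
int RunFaultingChild() {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    if (!InstallSegvReporter()) _exit(1);
    raise(SIGSEGV);  // stands in for a real invalid memory access
    _exit(2);        // not reached if the handler ran
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling RunFaultingChild() from main() should show the child leaving through the handler rather than through the default core-dumping action, which is how a dying process can still report its own fault.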

If you take a look at the functions wait() or waitpid(), you will find that one of the bits in the exit status indicates a core dump. The POSIX specification mentions WIFSIGNALED [sic] and WTERMSIG to get the signal that terminated the process. The POSIX specification doesn't mention it, but on Mac OS X (10.7.4) for example, there's a WCOREDUMP() macro to test whether a core file was created.
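This is roughly how a parent such as a shell can learn about the fault. The sketch below, assuming a POSIX system, forks a child that dies from SIGSEGV under the default action and decodes the wait status with the macros mentioned above. DescribeStatus and WaitForFaultingChild are hypothetical names, and the message format only imitates what shells typically print.

```cpp
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <string>

// Hypothetical helper: turns a wait status into a shell-style message.
std::string DescribeStatus(int status) {
  if (WIFSIGNALED(status)) {
    // strsignal() gives a readable name for the terminating signal.
    std::string text = strsignal(WTERMSIG(status));
#ifdef WCOREDUMP  // not in POSIX, but provided by many systems
    if (WCOREDUMP(status)) text += " (core dumped)";
#endif
    return text;
  }
  if (WIFEXITED(status)) {
    return "exited with status " + std::to_string(WEXITSTATUS(status));
  }
  return "unknown status";
}

// Hypothetical helper: forks a child that is killed by SIGSEGV under the
// default action and returns its wait status, or -1 on error.
int WaitForFaultingChild() {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    signal(SIGSEGV, SIG_DFL);  // make sure the default action applies
    raise(SIGSEGV);            // stands in for a real invalid access
    _exit(0);                  // not reached
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return status;
}
```

Passing the result of WaitForFaultingChild() to DescribeStatus() gives the kind of text a shell prints after a crashed command. Whether the core-dump bit is set depends on the system's core file limits.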

You can use code like this, which runs GDB to dump the backtrace of the crashing process:

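#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// SIGSEGV handler: forks a child that attaches gdb to the dying process
// and prints its backtrace, then exits. Not every call used here is
// async-signal-safe, so treat this as a best-effort debugging aid.
void DumpBacktrace(int) {
  pid_t dying_pid = getpid();
  pid_t child_pid = fork();
  if (child_pid < 0) {
    perror("fork() while collecting backtrace");
  } else if (child_pid == 0) {
    char buf[1024];
    // sed drops gdb's output up to and including the signal handler frame.
    snprintf(buf, sizeof(buf), "gdb -p %d -batch -ex bt 2>/dev/null | "
             "sed '0,/<signal handler/d'", (int)dying_pid);
    // execl() keeps the current environment, so sh can find gdb via PATH.
    execl("/bin/sh", "sh", "-c", buf, (char*)NULL);
    _exit(1);  // reached only if exec failed
  } else {
    waitpid(child_pid, NULL, 0);  // let gdb finish before this process dies
  }
  _exit(1);
}

// Installs DumpBacktrace as the SIGSEGV handler.
void BacktraceOnSegv() {
  struct sigaction action = {};
  action.sa_handler = DumpBacktrace;
  if (sigaction(SIGSEGV, &action, NULL) < 0) {
    perror("sigaction(SEGV)");
  }
}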

Here is an implementation that supports more platforms.

Okay, to start with, a segmentation fault happens when the CPU attempts to access an address to which the process doesn't have access. At the lowest level, the implementation of memory mapping has to detect that, which in general produces an interrupt. The kernel receives that interrupt and consults a table of addresses of other segments of code, each of which is intended to handle a particular interrupt.

When the kernel receives that interrupt, it translates it into a specific signal value (I'm being vague because the exact details vary both with hardware architecture and kernel implementation). SIGSEGV is usually defined to have the value 11, but the exact value isn't important; it's defined in signal.h.

At that point, the signal value is passed to another table inside the kernel, which contains the addresses of "signal handlers". One of those handlers is at the offset represented by SIGSEGV. Unless you have done something to change it, that address is usually that of a routine that causes a core dump, assuming the appropriate limits permit, but you can replace it with the address of your own routine, which can do anything you like.
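From user space, that per-signal entry can be read and replaced with sigaction(). The following is a small sketch under the assumption of a POSIX system; MyHandler, SegvIsDefault and ReplaceSegvHandler are hypothetical names used only for illustration.

```cpp
#include <signal.h>
#include <unistd.h>

// Hypothetical handler used only to show the table entry changing.
// It exits at once, because returning from a real SIGSEGV would
// normally re-run the faulting instruction.
extern "C" void MyHandler(int) { _exit(1); }

// Returns true if the SIGSEGV entry currently holds the default action.
bool SegvIsDefault() {
  struct sigaction current = {};
  // A null second argument queries the entry without changing it.
  if (sigaction(SIGSEGV, nullptr, &current) != 0) return false;
  return current.sa_handler == SIG_DFL;
}

// Replaces the SIGSEGV entry with `handler` and stores the previous
// entry in *previous so it can be restored later. Returns true on success.
bool ReplaceSegvHandler(void (*handler)(int), struct sigaction* previous) {
  struct sigaction action = {};
  action.sa_handler = handler;
  sigemptyset(&action.sa_mask);
  return sigaction(SIGSEGV, &action, previous) == 0;
}
```

In a program that has not touched its signal setup, SegvIsDefault() would be expected to return true before ReplaceSegvHandler(MyHandler, &old) is called and false afterwards. Passing the saved entry back to sigaction() restores the original behaviour.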
