
waitpid - WIFEXITED returning 0 although child exited normally

I have been writing a program that spawns a child process and calls waitpid to wait for the child's termination. The code is below:

  // fork & exec the child
  pid_t pid = fork();
  if (pid == -1)
    {
      // here is error handling code that is **not** triggered
    }

  if (!pid)
    {
      // binary_invocation is an array holding the child program's path and its arguments
      execv(args.binary_invocation[0], (char * const*)args.binary_invocation);
      // here is some error handling code that is **not** triggered
    }
  else
    {
      int status = 0;
      pid_t res = waitpid(pid, &status, 0);

      // here I see res being a positive integer > 0
      // and status being 11, which means WIFEXITED(status) is 0.
      // this triggers a warning in my program's output.
    }

The manpage of waitpid states for WIFEXITED:

WIFEXITED(status)
    returns  true  if  the child terminated normally, that is, by calling exit(3) or
    _exit(2), or by returning from main().

I interpret this to mean that it should return a nonzero integer when the child terminates normally. That is not what happens in my program, where I observe WIFEXITED(status) == 0.

However, executing the same program from the command line results in $? == 0, and starting it from gdb results in:

[Inferior 1 (process 31934) exited normally]

The program behaves normally apart from the triggered warning, which makes me think something else is going on here that I am missing.

EDIT:
As suggested below in the comments, I checked whether the child is terminated by a segfault, and indeed, WIFSIGNALED(status) returns 1 and WTERMSIG(status) returns 11, which is SIGSEGV.

What I don't understand, though, is why a call via execv would fail with a segfault while the same call via gdb or a shell succeeds.

EDIT2:
The behaviour of my application depends heavily on the behaviour of the child process, in particular on a file the child writes in a function declared __attribute__ ((destructor)). After the waitpid call returns, this file exists and is generated correctly, which means the segfault occurs somewhere in another destructor, or somewhere outside of my control.

On Unix and Linux systems, the status returned from wait or waitpid (or any of the other wait variants) has this structure:

bits   meaning

0-6    signal number that caused child to exit,
       or 0177 if child stopped / continued
       or zero if child exited without a signal

 7     1 if core dumped, else 0

8-15   low 8 bits of value passed to _exit/exit or returned by main,
       or signal that caused child to stop/continue

(Note that POSIX doesn't define the bits, just the macros, but these are the bit definitions used by at least Linux, Mac OS X/iOS, and Solaris. Also note that waitpid only returns for stop events if you pass it the WUNTRACED flag, and for continue events if you pass it the WCONTINUED flag.)

So a status of 11 means the child was terminated by signal 11, which is SIGSEGV (again, the number is not specified by POSIX, but it is the conventional value).
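To see this decoding in practice, here is a minimal sketch that reproduces the situation with a child that terminates itself by a signal, then inspects the status using only the POSIX macros rather than the raw bits. The helper names status_of_child and describe_status are made up for this illustration and are not part of the original program.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that terminates itself with `sig`
 * (or exits normally with code 0 when sig == 0).
 * Returns the raw wait status, or -1 if fork or waitpid failed. */
int status_of_child(int sig)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (sig != 0) {
            signal(sig, SIG_DFL);  /* make sure the default action applies */
            raise(sig);
        }
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}

/* Hypothetical helper: report what a wait status means,
 * using the portable macros instead of the bit layout. */
void describe_status(int status)
{
    if (WIFEXITED(status))
        printf("exited normally, code %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("killed by signal %d\n", WTERMSIG(status));
    else
        printf("stopped or continued (raw status %d)\n", status);
}
```

Calling describe_status(status_of_child(SIGSEGV)) should report the child as killed by a signal rather than as having exited. The raw status is not necessarily exactly 11 in that case, because the core-dump bit may also be set, which is one reason to prefer WTERMSIG over comparing the integer directly.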

Either your program is passing invalid arguments to execv (a C library wrapper around execve or some other kernel-specific call), or the child behaves differently when you execv it than when you run it from the shell or gdb.
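One way to rule out the first possibility is to make sure the argument vector is NULL-terminated, as execv requires, and to make an execv failure visible to the parent. The following is only a sketch under those assumptions: run_child is a hypothetical helper, and the exit code 127 is borrowed from the shell convention for a command that could not be run, not something taken from the original program.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the vector argv, which must end
 * with a NULL pointer. Returns the raw wait status, or -1 if fork or
 * waitpid failed. */
int run_child(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execv(argv[0], argv);
        /* execv only returns on failure. Exit with 127 so the parent can
         * tell "could not exec" apart from a signal in the child. */
        _exit(127);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}
```

With this in place, an argument vector such as { "/bin/sh", "-c", "exit 3", NULL } should yield a status for which WIFEXITED is true and WEXITSTATUS is 3, while a path that cannot be executed shows up as exit code 127. If the child still dies with SIGSEGV when its arguments are well-formed, the difference is more likely in how the child runs (for example its environment) than in the execv call itself.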

If you are on a system that supports strace, run your (parent) program under strace -f to see whether execv is causing the signal.
