waitpid/wexitstatus returning 0 instead of correct return code
I have the helper function below, used to execute a command and get its return value on POSIX systems. I used to use popen, but it is impossible to get the return code of an application with popen if it runs and exits before popen/pclose gets a chance to do its work.
The following helper function forks a process, uses execvp to run the desired external process, and then the parent uses waitpid to get the return code. I'm seeing odd cases where it's refusing to run.
When called with wait = true, waitpid should return the exit code of the application no matter what. However, I'm seeing stdout output indicating that the return code should be non-zero, yet the return code is zero. Testing the external process in a regular shell and then echoing $? returns non-zero, so the problem is not that the external process returns the wrong code. If it's of any help, the external process being run is mount(8) (yes, I know I can use mount(2), but that's beside the point).
I apologize in advance for the code dump. Most of it is debugging/logging:
inline int ForkAndRun(const std::string &command, const std::vector<std::string> &args, bool wait = false, std::string *output = NULL)
{
    std::string debug;
    std::vector<char*> argv;
    for(size_t i = 0; i < args.size(); ++i)
    {
        argv.push_back(const_cast<char*>(args[i].c_str()));
        debug += "\"";
        debug += args[i];
        debug += "\" ";
    }
    argv.push_back((char*)NULL);
    neosmart::logger.Debug("Executing %s", debug.c_str());
    int pipefd[2];
    if (pipe(pipefd) != 0)
    {
        neosmart::logger.Error("Failed to create pipe descriptor when trying to launch %s", debug.c_str());
        return EXIT_FAILURE;
    }
    pid_t pid = fork();
    if (pid == 0)
    {
        close(pipefd[STDIN_FILENO]); //child isn't going to be reading
        dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
        close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd
        dup2(pipefd[STDOUT_FILENO], STDERR_FILENO);
        if (execvp(command.c_str(), &argv[0]) != 0)
        {
            exit(EXIT_FAILURE);
        }
        return 0;
    }
    else if (pid < 0)
    {
        neosmart::logger.Error("Failed to fork when trying to launch %s", debug.c_str());
        return EXIT_FAILURE;
    }
    else
    {
        close(pipefd[STDOUT_FILENO]);
        int exitCode = 0;
        if (wait)
        {
            waitpid(pid, &exitCode, wait ? __WALL : (WNOHANG | WUNTRACED));
            std::string result;
            char buffer[128];
            ssize_t bytesRead;
            while ((bytesRead = read(pipefd[STDIN_FILENO], buffer, sizeof(buffer)-1)) != 0)
            {
                buffer[bytesRead] = '\0';
                result += buffer;
            }
            if (wait)
            {
                if ((WIFEXITED(exitCode)) == 0)
                {
                    neosmart::logger.Error("Failed to run command %s", debug.c_str());
                    neosmart::logger.Info("Output:\n%s", result.c_str());
                }
                else
                {
                    neosmart::logger.Debug("Output:\n%s", result.c_str());
                    exitCode = WEXITSTATUS(exitCode);
                    if (exitCode != 0)
                    {
                        neosmart::logger.Info("Return code %d", (exitCode));
                    }
                }
            }
            if (output)
            {
                result.swap(*output);
            }
        }
        close(pipefd[STDIN_FILENO]);
        return exitCode;
    }
}
Note that the command runs OK with the correct parameters, the function proceeds without any problems, and WIFEXITED returns TRUE. However, WEXITSTATUS returns 0 when it should be returning something else.
This probably isn't your main issue, but I think I see a small problem. In your child process, you have:
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO); //but wait, this pipe is closed!
But I think what you want is:
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd for both, can close
I don't have much experience with forks and pipes in Linux, but I did write a similar function pretty recently. You can take a look at the code to compare, if you'd like. I know that my function works.
I'm using the mongoose library, and grepping my code for SIGCHLD revealed that calling mg_start from mongoose sets SIGCHLD to SIG_IGN.
From the waitpid man page: on Linux, when SIGCHLD is set to SIG_IGN, terminated children do not become zombies, so waitpid will fail (with errno set to ECHILD) if the process has already run and exited, but will work if it hasn't exited yet. This was the cause of the sporadic failure of my code.
Simply re-setting SIGCHLD, after calling mg_start, to a void function that does absolutely nothing was enough to keep the zombie records from being immediately erased.
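A minimal sketch of that workaround, assuming it runs after the library has set SIGCHLD to SIG_IGN. The names OnSigchld, RestoreSigchld and ExitCodeOfChild are hypothetical and not part of mongoose or the original code. Installing any real handler, even an empty one, makes terminated children stay around as zombies until the parent collects them with waitpid.

```cpp
#include <cerrno>
#include <csignal>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical no-op handler. Unlike SIG_IGN, an installed handler
// lets the kernel keep the exit status of a terminated child.
extern "C" void OnSigchld(int) {}

// Hypothetical helper: replace whatever disposition SIGCHLD has
// (for example SIG_IGN set by a library) with the no-op handler.
bool RestoreSigchld()
{
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = OnSigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART; // restart interrupted system calls where possible
    return sigaction(SIGCHLD, &sa, NULL) == 0;
}

// Hypothetical helper for illustration: fork a child that exits with
// `code` and return what waitpid reports, or -1 if waitpid fails.
int ExitCodeOfChild(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code); // child: exit immediately with the requested code

    int status = 0;
    pid_t r;
    do
    {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);

    if (r != pid)
        return -1; // e.g. ECHILD when SIGCHLD is being ignored
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling RestoreSigchld() once, right after mg_start, should be enough; checking the return value of waitpid (rather than only the status word) is what makes the ECHILD case visible instead of silently reporting 0.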
Per @Geoff_Montee's advice, there was a bug in my redirect of STDERR, but this was not responsible for the problem: the exit status of the process started by execvp is not passed through STDERR or even STDOUT, but is kept in the kernel's record of the terminated child until the parent collects it (the zombie record).
@jilles' warning about non-contiguity of vector in C++ does not apply to C++03 and up (it is only valid for C++98, though in practice most C++98 compilers used contiguous storage anyway) and was not related to this issue. However, the advice on reading from the pipe before blocking and checking the output of waitpid is spot-on.
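The ordering matters because a child that writes more than the pipe buffer can hold blocks until the parent reads, so a parent that waits first can deadlock. A minimal sketch of the read-then-wait order, assuming the program to run is args[0]; RunAndCapture is a hypothetical name and this is not the original ForkAndRun:

```cpp
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run args[0] with args, capture stdout and stderr
// into *output, and return the exit code (or -1 on any failure).
int RunAndCapture(const std::vector<std::string> &args, std::string *output)
{
    if (args.empty())
        return -1;
    std::vector<char*> argv;
    for (size_t i = 0; i < args.size(); ++i)
        argv.push_back(const_cast<char*>(args[i].c_str()));
    argv.push_back(NULL);

    int pipefd[2];
    if (pipe(pipefd) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0)
    {
        close(pipefd[0]);               // child does not read
        dup2(pipefd[1], STDOUT_FILENO);
        dup2(pipefd[1], STDERR_FILENO); // duplicate both before closing
        close(pipefd[1]);
        execvp(argv[0], &argv[0]);
        _exit(127);                     // only reached if execvp failed
    }

    close(pipefd[1]); // parent keeps only the read end
    char buffer[256];
    for (;;)
    {
        // Drain the pipe until EOF, i.e. until the child closes its end.
        ssize_t n = read(pipefd[0], buffer, sizeof(buffer));
        if (n > 0)
        {
            if (output)
                output->append(buffer, static_cast<size_t>(n));
        }
        else if (n == 0 || errno != EINTR)
        {
            break;
        }
    }
    close(pipefd[0]);

    // Only now block for the exit status, and check waitpid's own result.
    int status = 0;
    pid_t r;
    do
    {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    if (r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Appending with an explicit length also avoids relying on a terminating '\0' in the buffer, which matters when read reports an error and returns -1.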
I've found that pclose does NOT block and wait for the process to end, contrary to the documentation (this is on CentOS 6). I've found that I need to call pclose and then call waitpid(pid,&status,0); to get the true return value.
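For comparison, POSIX specifies that pclose waits for the command and returns its termination status in the same encoding waitpid uses, or -1 with errno set (ECHILD if the status could not be obtained, which is what an ignored SIGCHLD can cause). The sketch below decodes that value; RunWithPopen is a hypothetical name, and the behaviour observed on CentOS 6 above is not something this example reproduces.

```cpp
#include <stdio.h>
#include <string>
#include <sys/wait.h>

// Hypothetical helper: run `command` through the shell via popen,
// collect its stdout, and return the decoded exit code (or -1).
int RunWithPopen(const char *command, std::string *output)
{
    FILE *fp = popen(command, "r");
    if (!fp)
        return -1;

    char buffer[256];
    while (fgets(buffer, sizeof(buffer), fp))
    {
        if (output)
            *output += buffer;
    }

    // pclose returns a wait-style status word, not the exit code itself.
    int status = pclose(fp);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If pclose returns -1 here, checking errno for ECHILD is a quick way to tell whether the SIGCHLD disposition, rather than the command, is the culprit.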