
Threads and fork: fgetc() blocks when reading from popen()-pipe

In a multithreaded program (running on ARM), the main thread periodically checks, among other things, whether another program is running by calling popen( "pidof -s prog" ). I use the O_CLOEXEC flag for the file descriptor and check whether fgetc() receives anything from the pipe. Setting the file descriptor to non-blocking does not help: nothing is read at all. The same pidof command works fine when run from the shell.
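For reference, a minimal sketch of such a check. The helper names read_command_output and is_running are hypothetical, and the "e" in the popen() mode string is a glibc extension (2.9 and later) that sets close-on-exec on the pipe; the original code may set the flag differently. This sketch does not by itself reproduce the blocking.

```cpp
#include <stdio.h>
#include <string>

// Runs `cmd` through popen() and returns everything it wrote to stdout.
// Mode "re": read end of the pipe, with close-on-exec set (glibc extension;
// plain "r" is the portable form).
std::string read_command_output(const char* cmd) {
    std::string out;
    FILE* pipe = popen(cmd, "re");
    if (pipe == nullptr)
        return out;                 // popen() failed: report "no output"

    int c;
    while ((c = fgetc(pipe)) != EOF)   // this is the call that blocked
        out.push_back(static_cast<char>(c));

    pclose(pipe);                   // waits for the popen() child
    return out;
}

// Hypothetical wrapper: pidof -s prints one pid if the program is running
// and prints nothing otherwise.
bool is_running(const char* program) {
    const std::string cmd = std::string("pidof -s ") + program;
    return !read_command_output(cmd.c_str()).empty();
}
```

Note that pclose() itself calls waitpid() for the popen() child, so a SIGCHLD handler that reaps every child with waitpid( -1, ... ) competes with it.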

In another thread, fork() with an immediate execl() in the child process is used to start an rsync operation whenever a specific event occurs. The parent uses a signal handler to observe the child's status and can kill the child when another specific event occurs. It doesn't matter whether I exec rsync or sleep; the result is the same.
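The fork-and-exec pattern described here might look roughly like the sketch below. The function names spawn_child and terminate_child are hypothetical, and execlp() is used instead of execl() only so that the program is looked up in PATH. Unlike the original program, this sketch waits for the child synchronously instead of using a SIGCHLD handler.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks and immediately replaces the child with `program arg`.
// Returns the child's pid in the parent, or -1 if fork() failed.
pid_t spawn_child(const char* program, const char* arg) {
    pid_t pid = fork();
    if (pid == 0) {
        execlp(program, program, arg, static_cast<char*>(nullptr));
        _exit(127);                 // only reached if exec failed
    }
    return pid;
}

// Sends SIGTERM to the child and waits until it has been reaped.
// Returns the wait status, or -1 on error.
int terminate_child(pid_t pid) {
    if (kill(pid, SIGTERM) == -1)
        return -1;

    int wstatus = 0;
    while (waitpid(pid, &wstatus, 0) == -1) {
        if (errno != EINTR)         // retry only if interrupted by a signal
            return -1;
    }
    return wstatus;
}
```

For example, spawn_child( "sleep", "30" ) followed by terminate_child() on the returned pid yields a status for which WIFSIGNALED() is true.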

The problem is that the fgetc() in the main thread blocks until the child process terminates.

I'll try to solve this problem by fork()ing early (at a point where the application is still single-threaded, as suggested in another post I started).

But anyway:

I'd like to understand what's causing the fgetc() to block when reading from the pipe.

A few things I've tried so far:

  • I tried to reproduce the problem with a small example application that does what I've described above and hoped it would show the same erroneous behaviour, but unfortunately it works fine, which is why I do not provide any code here yet. Maybe I'm missing the relevant point.
  • Using the same rsync invocation via system() doesn't cause any issues
  • I've had a look at a system() implementation and can see that the signals are manipulated before fork()ing:

    • SIGCHLD is blocked
    • SIGINT and SIGQUIT are ignored

    I need the signal handler for SIGCHLD, but out of curiosity I tried to do the same as that implementation (replacing sigprocmask() with pthread_sigmask()). It made no difference; the behaviour stays the same.

    I couldn't find any implementation of system() in the sources provided with my BSP.
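The signal handling described in the list above can be sketched as follows. This is a simplified illustration of what a typical system() implementation does, not the code from any particular libc; the name run_like_system is hypothetical. It uses sigprocmask() as in the implementation mentioned above; in a multithreaded program pthread_sigmask() would be the call to use.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `command` via /bin/sh and returns its wait status (or -1 on error).
int run_like_system(const char* command) {
    // Ignore SIGINT and SIGQUIT in the parent while the child runs.
    struct sigaction ignore {};
    struct sigaction old_int {};
    struct sigaction old_quit {};
    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    sigaction(SIGINT, &ignore, &old_int);
    sigaction(SIGQUIT, &ignore, &old_quit);

    // Block SIGCHLD so no handler runs before waitpid() has seen the child.
    sigset_t block;
    sigset_t old_mask;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    int wstatus = -1;
    pid_t pid = fork();
    if (pid == 0) {
        // Child: restore the original dispositions and mask, then exec.
        sigaction(SIGINT, &old_int, nullptr);
        sigaction(SIGQUIT, &old_quit, nullptr);
        sigprocmask(SIG_SETMASK, &old_mask, nullptr);
        execl("/bin/sh", "sh", "-c", command, static_cast<char*>(nullptr));
        _exit(127);
    }
    if (pid > 0) {
        // Parent: wait for exactly this child, retrying on EINTR.
        while (waitpid(pid, &wstatus, 0) == -1) {
            if (errno != EINTR) {
                wstatus = -1;
                break;
            }
        }
    }

    // Restore the parent's signal state.
    sigaction(SIGINT, &old_int, nullptr);
    sigaction(SIGQUIT, &old_quit, nullptr);
    sigprocmask(SIG_SETMASK, &old_mask, nullptr);
    return wstatus;
}
```

The key difference from the fork() in the worker thread is that the parent waits for one specific pid rather than for any child.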

The program also opens other files via fstream, without O_CLOEXEC (changing that would be a bit cumbersome).

Bugfix and explanation of unexpected behaviour

Indeed, I had missed the relevant point. After adapting the sample program to resemble the original code more closely, I saw that the signal handler (which had worked in a test program) was the issue. Excerpt:

void MyClass::sig_handler(int sig) {
    if( m_pid < 1 ) // not the child we're waiting for
        return;

    pid_t pid;
    int wstatus;

    while ((pid = waitpid( -1, &wstatus, WNOHANG )) != -1 ) {
        // error: this returns 0 as long as any children are alive
        // -> check for "> 0" to ignore active child processes
        if( pid != m_pid )
            return;
        // handle stuff here...
    }
}

I had to replace the following line

while ((pid = waitpid( -1, &wstatus, WNOHANG )) != -1 )

with

while ((pid = waitpid( -1, &wstatus, WNOHANG )) > 0 )

because the program's other threads also fork() children (e.g. with popen()). When those terminate, the signal handler (a static class function) is invoked as well.
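As a standalone illustration of the corrected loop condition, here is a sketch with a hypothetical free function reap_finished in place of the class member. Unlike the excerpt above, it keeps reaping after seeing an unrelated child instead of returning early; that is a variation for illustration, not the original handler.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Reaps every child that has already terminated and reports whether
// `watched` was among them. The loop ends as soon as waitpid() returns
// 0 (children exist, but none has changed state) or -1 (error, for
// example no children left), so it never spins while a child is alive.
bool reap_finished(pid_t watched) {
    bool seen = false;
    int wstatus = 0;
    pid_t pid;

    while ((pid = waitpid(-1, &wstatus, WNOHANG)) > 0) {
        if (pid == watched)
            seen = true;            // the child we were waiting for is done
    }
    return seen;
}
```

With the "!= -1" condition the same loop would keep polling for as long as any child is still running, which is the blocking observed in the main thread.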

As I understand:

In the thread where I invoke fork(), I use a member m_pid whose default and reset value is -1. It takes the pid returned by fork(). The signal handler returns immediately if m_pid is -1.

The program blocked after popen(), which fork()s (it could be any other call that fork()s). The signal handler for SIGCHLD is entered when the popen() child terminates. The check for m_pid passes because m_pid = fork() has already been executed. waitpid() does not return -1 but the pid of the popen() child, and the loop then keeps polling with a return value of 0 until all children have terminated, while the one I'm waiting for is still alive! Only then does waitpid() return -1, and the main thread can continue reading with fgetc().

From the waitpid man page:

if WNOHANG was specified and one or more child(ren) specified by pid exist, but have not yet changed state, then 0 is returned. On error, -1 is returned
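The outcomes quoted above can be made explicit with a small helper. This is a sketch; the enum ChildState and the function poll_child are hypothetical names introduced for illustration.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildState { StillRunning, Changed, NoSuchChild, Error };

// Polls one child without blocking and classifies waitpid()'s result.
ChildState poll_child(pid_t pid) {
    int wstatus = 0;
    const pid_t r = waitpid(pid, &wstatus, WNOHANG);

    if (r > 0)
        return ChildState::Changed;       // state changed; child was reaped
    if (r == 0)
        return ChildState::StillRunning;  // child exists, nothing to report
    // r == -1: an error; ECHILD means there is no such child to wait for
    return errno == ECHILD ? ChildState::NoSuchChild : ChildState::Error;
}
```

Only the first case means a child has actually terminated; a loop that treats 0 like a pid will keep spinning while children are alive.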

Because the signal handler checks for m_pid != -1, the problem only occurred when I used fork() in MyClass, which sets m_pid.

That's why using system() did not cause the problem: m_pid is never set to a value other than -1, so the signal handler returns immediately when, e.g., a child is popen()ed in the main thread.

The imitation of the system() invocation failed because I set m_pid at fork(), so the signal handler did not return immediately.

I guess that since the signal handler is a static member function, the handler blocks the very thread that fork()ed a child process.
