
Is output read from popen()ed FILE* complete before pclose()?

The man page for pclose() says:

The pclose() function waits for the associated process to terminate and returns the exit status of the command as returned by wait4(2).

I feel like this means that if the FILE* created by popen() was opened with type "r" in order to read the command's output, then you can't really be sure the output is complete until after the call to pclose(). But after pclose(), the closed FILE* must surely be invalid, so how can you ever be certain you've read the entire output of the command?

To illustrate my question by example, consider the following code:

// main.cpp

#include <iostream>
#include <cstdio>
#include <cerrno>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>

int main( int argc, char* argv[] )
{
  FILE* fp = popen( "someExecutableThatTakesALongTime", "r" );
  if ( ! fp )
  {
    std::cout << "popen failed: " << errno << " " << strerror( errno )
              << std::endl;
    return 1;
  }

  char buf[512] = { 0 };
  // Read at most sizeof buf - 1 bytes so that buf always stays NUL-terminated.
  fread( buf, 1, sizeof buf - 1, fp );
  std::cout << buf << std::endl;

  // If we're only certain the output-producing process has terminated after the
  // following pclose(), how do we know the content retrieved above with fread()
  // is complete?
  int r = pclose( fp );

  // But if we wait until after the above pclose(), fp is invalid, so
  // there's nowhere from which we could retrieve the command's output anymore,
  // right?

  std::cout << "exit status: " << WEXITSTATUS( r ) << std::endl;

  return 0;
}

My questions, as inline above: if we're only certain the output-producing child process has terminated after the pclose(), how do we know the content retrieved with the fread() is complete? But if we wait until after the pclose(), fp is invalid, so there's nowhere from which we could retrieve the command's output anymore, right?

This feels like a chicken-and-egg problem, but I've seen code similar to the above all over, so I'm probably misunderstanding something. I'm grateful for an explanation on this.

TL;DR executive summary: how do we know the content retrieved with the fread() is complete? Because we've got an EOF.

You get an EOF when the child process closes its end of the pipe. This can happen when it calls close explicitly or exits. Nothing can come out of your end of the pipe after that. After getting an EOF you don't know whether the process has terminated, but you do know for sure that it will never write anything to the pipe.

By calling pclose you close your end of the pipe and wait for termination of the child. When pclose returns, you know that the child has terminated.
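As a minimal sketch of that order of operations (read until EOF first, call pclose afterwards), assuming a POSIX system; the helper name read_all_then_close is hypothetical and not part of any library:

```cpp
#include <cstdio>
#include <string>
#include <sys/wait.h>

// Hypothetical helper: run `cmd` through popen(), collect everything it
// writes to stdout in `out`, and store the raw pclose() status in `status`.
// Returns false if popen() fails, a read error occurs, or pclose() fails.
bool read_all_then_close(const char* cmd, std::string& out, int& status) {
    FILE* fp = popen(cmd, "r");
    if (!fp) return false;

    char buf[256];
    size_t n;
    // fread() returns 0 only at end-of-file or on error. EOF arrives once
    // the write end of the pipe has been closed by the child.
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        out.append(buf, n);

    bool ok = !ferror(fp);  // must be checked before pclose() invalidates fp
    status = pclose(fp);    // waits for the child; its output is already in `out`
    return ok && status != -1;
}
```

By the time pclose is called here, the loop has already seen EOF, so nothing can be lost; the status can then be decoded with WIFEXITED and WEXITSTATUS.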

If you call pclose without getting an EOF, and the child then tries to write to its end of the pipe, the write will fail (in fact the child will get a SIGPIPE and probably die).
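The early-close case can be tried out with a rough sketch like the one below. It assumes a POSIX system where the `yes` utility is available (any command that writes forever would do), and the helper names are hypothetical. How the failure shows up in the status is not fixed: depending on the shell, and on whether SIGPIPE was ignored in the parent, the status may report death by SIGPIPE, an exit code of 128 + SIGPIPE, or an ordinary non-zero exit after a failed write.

```cpp
#include <cstdio>
#include <sys/wait.h>

// Hypothetical helper: start a command that keeps writing, read only a
// little of its output, then close early. Returns the raw pclose() status.
int close_before_eof(const char* cmd) {
    FILE* fp = popen(cmd, "r");
    if (!fp) return -1;

    char buf[64];
    size_t n = fread(buf, 1, sizeof buf, fp);  // take a few bytes, not all
    (void)n;

    // Closing our read end makes the child's next write fail (by default
    // via SIGPIPE); pclose() then waits for the child to terminate.
    return pclose(fp);
}

// True if the status shows that the command did not finish with exit(0).
bool ended_abnormally(int status) {
    if (status == -1) return false;  // popen()/pclose() failed; nothing to decode
    return WIFSIGNALED(status) ||
           (WIFEXITED(status) && WEXITSTATUS(status) != 0);
}
```

Calling ended_abnormally(close_before_eof("yes")) should report true, because the writer is cut off before it can finish.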

There is absolutely no room for any chicken-and-egg situation here.

Read the documentation for pclose() more carefully:

The pclose() function shall close a stream that was opened by popen(), wait for the command to terminate, and return the termination status of the process that was running the command language interpreter.

It blocks and waits.

popen() is just a shortcut for a series of calls: pipe, fork, dup2, exec, fdopen, etc. It gives us easy access to the child's stdout (type "r") or stdin (type "w") through ordinary stream operations.
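To make that series of calls concrete, here is a simplified sketch of what a read-mode popen()/pclose() pair could look like. The names my_popen_read and my_pclose are hypothetical, error handling is minimal, and real implementations differ in detail (for example, they track the child's pid internally instead of handing it to the caller).

```cpp
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for popen(cmd, "r"); the child's pid is stored in `pid`.
FILE* my_popen_read(const char* cmd, pid_t& pid) {
    int fds[2];
    if (pipe(fds) == -1) return nullptr;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return nullptr;
    }
    if (pid == 0) {
        // Child: make stdout refer to the pipe's write end, then run the shell.
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", cmd, static_cast<char*>(nullptr));
        _exit(127);  // only reached if exec failed
    }
    // Parent: keep only the read end and wrap it in a stdio stream.
    close(fds[1]);
    FILE* fp = fdopen(fds[0], "r");
    if (!fp) close(fds[0]);  // a fuller version would also reap the child here
    return fp;
}

// Hypothetical stand-in for pclose(): close the stream, then reap the child.
int my_pclose(FILE* fp, pid_t pid) {
    fclose(fp);
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) return -1;
    return status;
}
```

The parent closing its copy of the write end matters: if it kept fds[1] open, the read end would never report EOF, even after the child exits.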

After popen(), the parent and the child process run independently. pclose() is not a 'kill' function; it just waits for the child process to terminate. Because pclose() closes the parent's end of the pipe before it waits, any output the child has produced but the parent has not yet read, or produces afterwards, is lost.

To avoid this data loss, call pclose() only once the child has finished writing: fgets() returns NULL or fread() returns fewer items than requested, the stream has reached its end, and feof() returns true.

Here is an example of using popen() with fread(). This function returns -1 on failure and 0 on success. The child's output is returned in szResult.

#include <cstdio>
#include <string>
#include <sys/wait.h>

int exec_command( const char * szCmd, std::string & szResult ){

    printf("Execute command : [%s]\n", szCmd );

    FILE * pFile = popen( szCmd, "r");
    if(!pFile){
            printf("Execute command : [%s] FAILED !\n", szCmd );
            return -1;
    }

    char buf[256];
    size_t nRead;

    //fread() BLOCKS until the buffer is full, the write end of the pipe is
    //closed (EOF) or an error occurs; it returns 0 only at EOF or on error.
    while( (nRead = fread(buf, 1, sizeof buf, pFile)) > 0 ){
        //append exactly nRead bytes, so embedded '\0' bytes are preserved
        szResult.append(buf, nRead);
    }

    //must be checked before pclose() invalidates the stream
    bool bReadError = ferror(pFile) != 0;

    //the child has finished writing. pclose() waits for it to terminate and
    //reaps it, otherwise we leave another zombie in the process table.
    int nStatus = pclose(pFile);

    printf("Exec command [%s] return : \n[%s]\n",  szCmd, szResult.c_str() );

    //fail on a read error, a pclose() error, or a non-zero exit status
    if( bReadError || nStatus == -1 || !WIFEXITED(nStatus) || WEXITSTATUS(nStatus) != 0 )
        return -1;
    return 0;
}

Note that all file operations on the returned stream are BLOCKING: the stream is opened without the O_NONBLOCK flag. fread() can block forever if the child process hangs and never terminates, so use popen() only with trusted programs.

To get more control over the child process and avoid blocking file operations, we can call fork/vfork/execv etc. ourselves, set the O_NONBLOCK flag on the pipe's file descriptor, call poll() or select() from time to time to check whether data is available, and then use read() to read it from the pipe.

Use waitpid() with WNOHANG periodically to see whether the child process has terminated.
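A rough sketch of that approach follows, assuming a POSIX system. The function name run_with_poll and its parameters are hypothetical; it gives up after max_idle_polls consecutive polls of poll_ms milliseconds without data, and only then uses waitpid() with WNOHANG to decide whether the child still has to be killed.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run `cmd` via /bin/sh and collect its stdout in `out`.
// Returns the raw wait status, or -1 on error or when the child stays silent
// for max_idle_polls consecutive polls.
int run_with_poll(const char* cmd, std::string& out,
                  int poll_ms, int max_idle_polls) {
    int fds[2];
    if (pipe(fds) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) { close(fds[0]); close(fds[1]); return -1; }
    if (pid == 0) {                       // child: stdout -> pipe, run command
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", cmd, static_cast<char*>(nullptr));
        _exit(127);
    }
    close(fds[1]);
    // Make the read end non-blocking so read() can never hang.
    fcntl(fds[0], F_SETFL, fcntl(fds[0], F_GETFL) | O_NONBLOCK);

    bool eof = false;
    int idle = 0;
    while (!eof && idle < max_idle_polls) {
        struct pollfd pfd;
        pfd.fd = fds[0];
        pfd.events = POLLIN;
        pfd.revents = 0;
        int ready = poll(&pfd, 1, poll_ms);
        if (ready == -1) { if (errno == EINTR) continue; break; }
        if (ready == 0) { ++idle; continue; }   // no data within poll_ms
        idle = 0;
        char buf[256];
        ssize_t n = read(fds[0], buf, sizeof buf);
        if (n > 0) out.append(buf, static_cast<size_t>(n));
        else if (n == 0) eof = true;            // write end has been closed
        else if (errno != EAGAIN && errno != EINTR) break;
    }
    close(fds[0]);

    int status = 0;
    if (!eof) {
        // Gave up: check without blocking whether the child already exited,
        // and kill it if it is still running.
        if (waitpid(pid, &status, WNOHANG) == 0) {
            kill(pid, SIGKILL);
            waitpid(pid, &status, 0);
        }
        return -1;
    }
    if (waitpid(pid, &status, 0) == -1) return -1;
    return status;
}
```

Killing the shell does not necessarily kill processes the shell itself started; a more thorough version might put the child in its own process group and signal the whole group.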

I learned a couple things while researching this issue further, which I think answer my question:

Essentially: yes, it is safe to fread from the FILE* returned by popen prior to pclose. Assuming the buffer given to fread is large enough, you will not "miss" output generated by the command given to popen.

Going back and carefully considering what fread does: it effectively blocks until (size * nmemb) bytes have been read or end-of-file (or an error) is encountered.

Thanks to "C - pipe without using popen", I understand better what popen does under the hood: in the forked child process, it does a dup2 to redirect stdout to the write end of the pipe it uses. Importantly, it then performs some form of exec to execute the specified command in that child process, and after the child process terminates, its open file descriptors, including 1 (stdout), are closed. I.e. termination of the specified command is the condition by which the child process' stdout is closed.

Next, I went back and thought more carefully about what EOF really is in this context. At first, I was under the loosey-goosey and mistaken impression that "fread tries to read from a FILE* as fast as it can and returns/unblocks after the last byte is read". That's not quite true: as noted above, fread will read/block until its target number of bytes has been read or EOF or an error is encountered. The FILE* returned by popen comes from an fdopen of the read end of the pipe used by popen, so its EOF occurs when the child process' stdout, which was dup2'ed onto the write end of the pipe, is closed.

So, in the end what we have is: popen creating a pipe whose write end gets the output of a child process running the specified command, and whose read end is fdopen'ed to a FILE* passed to fread. Assuming fread's buffer is big enough, fread will block until EOF occurs, which corresponds to closure of the write end of popen's pipe resulting from termination of the executing command. I.e. because fread blocks until EOF is encountered, and EOF occurs after the command running in popen's child process terminates, it's safe to use fread (with a sufficiently large buffer) to capture the complete output of the command given to popen.
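One way to check this inference is a small experiment along the following lines: a sketch that issues exactly one fread and reports whether the stream was at EOF when it returned. The helper name single_fread and the shell command used with it are made up for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: issue exactly one fread() for up to `cap` bytes and
// report in `hit_eof` whether the stream was at end-of-file when it returned.
std::string single_fread(const char* cmd, size_t cap, bool& hit_eof) {
    hit_eof = false;
    FILE* fp = popen(cmd, "r");
    if (!fp) return std::string();

    std::string out(cap, '\0');
    // One call only: it returns once `cap` bytes have arrived, or earlier
    // if EOF or an error is encountered.
    size_t n = fread(&out[0], 1, cap, fp);
    hit_eof = feof(fp) != 0;
    out.resize(n);
    pclose(fp);
    return out;
}
```

With the command "echo first; sleep 1; echo second" and cap = 512, the single call should block through the sleep and return both lines with hit_eof set. With cap = 5 it should return after "first" without EOF, and the rest of the output is discarded by pclose. That is why a loop that reads until EOF is the more robust pattern when the output size is not known in advance.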

Grateful if anyone can verify my inferences and conclusions.
