
Why does `script.py <(cat *.gz)` work with subprocess.Popen in Python 2 but not Python 3?

We discovered recently that a script we developed chokes in Python 3.x (but not Python 2.x) if it is supplied its input files via process substitution, e.g.:

script.py <(cat *.gz)

We've tested with commands other than gzip, such as cat, just to see if we get a similar error. They all complain that /dev/fd/63 (or /dev/fd/63.gz) does not exist. Here's the (simplified) relevant bit of code:

import gzip
import io
import subprocess
import sys

# exeExists() is a helper defined elsewhere in the script (not shown here).

def open_gzip_in(infile):
    '''Opens a gzip file for reading, using external gzip if available'''

    # Determine whether to use the gzip command line tool or not
    if exeExists('gzip'):
        cmd = ['gzip', '-dc', infile]
        p = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=-1,
                             universal_newlines=True)
        if sys.version_info[0] == 2:
            # Python 2: Popen is not a context manager, so read until EOF
            with p.stdout:
                for line in iter(p.stdout.readline, b''):
                    yield line
        else:
            # Python 3: the context manager closes the pipe and waits
            with p:
                for line in p.stdout:
                    yield line
        exit_code = p.wait()
        if exit_code != 0:
            raise subprocess.CalledProcessError(
                exit_code, subprocess.list2cmdline(cmd), 'Ungzip failed')
    else:
        # Fall back to the (slower) pure-Python gzip module
        with io.TextIOWrapper(io.BufferedReader(gzip.open(infile))) as f:
            for line in f:
                yield line

Incidentally, we spawn the external process simply because the command-line gzip is significantly faster than gzip.open, and our script is a long-running worker: the difference is multiple hours.

We are implementing a work-around for this issue, but would like to understand why it doesn't work in Python 3 but does work in Python 2.

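This is a side effect of the new default for the close_fds argument of the Popen() family: since Python 3.2 it defaults to True on POSIX, whereas Python 2 defaulted to False. bash implements <(...) by opening a pipe on a spare descriptor (here 63) and passing the path /dev/fd/63 to script.py. With close_fds=True, every descriptor other than 0, 1 and 2 is closed in the child before gzip starts, so that path no longer refers to anything there. You can explicitly override it with close_fds=False, and your inherited file descriptors will be passed through to the child process (subject to configuration via os.set_inheritable()).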
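A minimal sketch of the effect, assuming a POSIX system that exposes /dev/fd and Python 3.5 or later (for subprocess.run). The pipe stands in for the descriptor bash opens for <(...), and the child is a Python one-liner standing in for gzip -dc; CHILD and child_can_read are illustrative names, not part of the original script.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for `gzip -dc`: opens the path in argv[1]
# and copies its contents to stdout.
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.write(open(sys.argv[1]).read())"]


def child_can_read(close_fds):
    """Hand a /dev/fd/N path to a child; return (returncode, stdout)."""
    r, w = os.pipe()
    os.write(w, b"hello\n")
    os.close(w)  # close the write end so the reader sees EOF
    # Descriptors created by Python 3.4+ are non-inheritable by default;
    # the one bash sets up for <(...) is inheritable, so mimic that.
    os.set_inheritable(r, True)
    try:
        p = subprocess.run(CHILD + ["/dev/fd/%d" % r],
                           close_fds=close_fds,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.DEVNULL,
                           universal_newlines=True)
    finally:
        os.close(r)
    return p.returncode, p.stdout
```

With close_fds=False the child should be able to open the path and echo the data back; with close_fds=True the descriptor has been closed before the child starts, so the open is expected to fail, which matches the "does not exist" errors described in the question.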

Similarly, on Python 3.2 and later, you can use the pass_fds argument, as in pass_fds=[0,1,2,63], to make stdin, stdout, stderr, and FD 63 available to the subprocess invoked. Descriptors 0, 1 and 2 are kept open in any case, so pass_fds=[63] is sufficient.
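Since the descriptor number is chosen by the shell, hard-coding 63 is fragile. The sketch below derives it from the path instead, assuming the shell passes a path of the form /dev/fd/N (or /proc/self/fd/N); fd_from_path and popen_reader are hypothetical helper names introduced for illustration only.

```python
import re
import subprocess


def fd_from_path(path):
    """Return N for a path like /dev/fd/N or /proc/self/fd/N, else None."""
    m = re.match(r"^/(?:dev|proc/self)/fd/(\d+)$", path)
    return int(m.group(1)) if m else None


def popen_reader(cmd_prefix, infile):
    """Start cmd_prefix + [infile], keeping infile's descriptor open
    in the child when infile names one."""
    fd = fd_from_path(infile)
    pass_fds = (fd,) if fd is not None else ()
    return subprocess.Popen(cmd_prefix + [infile],
                            stdout=subprocess.PIPE,
                            pass_fds=pass_fds,
                            universal_newlines=True)
```

In the script from the question, this would correspond to something like popen_reader(['gzip', '-dc'], infile). Ordinary file names fall through with an empty pass_fds, so the default close_fds=True behaviour is kept for them.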
