
Python redirection in subprocess.Popen

I want to run a bash command from Python using subprocess.Popen. My bash command looks like this:

$ gunzip -c /my/dir/file1.gz /my/dir/file2.gz | gsplit -l 500000 --numeric-suffixes=1 --suffix-length=3 --additional-suffix=.split - /my/dir/output/file_

It takes the compressed files, decompresses them, concatenates their contents, and splits the result into output files. I can do that in Python this way:

from __future__ import print_function
import os
import subprocess

src_dir = "/my/dir"  # not named dir, to avoid shadowing the builtin
files = ["file1.gz", "file2.gz"]

# os.path.join adds the "/" that plain string concatenation would drop
cmd1 = "gunzip -c {}".format(" ".join(os.path.join(src_dir, f) for f in files))
cmd2 = "{} -l {} --numeric-suffixes={} --suffix-length={} --additional-suffix={} - {}"\
        .format("gsplit", 500000, 1, 3, ".split", "/my/dir/output/file_")

proc1 = subprocess.Popen(str(cmd1).split(), stdout=subprocess.PIPE)
proc2 = subprocess.Popen(str(cmd2).split(), stdin=proc1.stdout, stdout=subprocess.PIPE)

proc1.stdout.close()
proc2.wait()
print("result:", proc2.returncode)

Then I can check the output:

$ ls /my/dir/output
file_001.split
file_002.split
file_003.split

Now I want to use gsplit's --filter argument, which pipes each output chunk to another command. Here I chose gzip because I want the output files compressed. The bash command looks like this:

$ gunzip -c /my/dir/file1.gz /my/dir/file2.gz | gsplit -l 500000 --numeric-suffixes=1 --suffix-length=3 --additional-suffix=.split --filter='gzip > $FILE.gz' - /my/dir/output/file_

This command works.

Putting it into Python code:

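from __future__ import print_function
import os
import subprocess

src_dir = "/my/dir"  # not named dir, to avoid shadowing the builtin
files = ["file1.gz", "file2.gz"]

# os.path.join adds the "/" that plain string concatenation would drop
cmd1 = "gunzip -c {}".format(" ".join(os.path.join(src_dir, f) for f in files))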
cmd2 = "{} -l {} --numeric-suffixes={} --suffix-length={} --additional-suffix={}  --filter={} - {}"\
        .format("gsplit", 500000, 1, 3, ".split", "'gzip > $FILE.gz'"
                , "/my/dir/output/file_")

proc1 = subprocess.Popen(str(cmd1).split(), stdout=subprocess.PIPE)
proc2 = subprocess.Popen(str(cmd2).split(), stdin=proc1.stdout, stdout=subprocess.PIPE)

proc1.stdout.close()
proc2.wait()
print("result:", proc2.returncode)

Alas I get this error:

/usr/local/bin/gsplit: invalid option -- 'f'
Try '/usr/local/bin/gsplit --help' for more information.
gunzip: error writing to output: Broken pipe
gunzip: /my/dir/file1.gz: uncompress failed
gunzip: error writing to output: Broken pipe
gunzip: /my/dir/file12.gz: uncompress failed

I think it has to do with the redirection symbol in gzip > $FILE.gz.

What is going on, how can I resolve this issue?

str.split() isn't the right function for converting a command-line string into a list of arguments: it splits on every run of whitespace and knows nothing about shell quoting. To see why, try:

print(str(cmd2).split())

Notice that --filter='gzip, >, and $FILE.gz' end up as three separate arguments, with the single quotes left in as literal characters.
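To make the difference concrete, here is a small sketch that runs both splitters on a shortened version of the gsplit command. The variable names are only illustrative and are not from the question.

```python
import shlex

# Shortened gsplit command from the question: only the --filter part matters here.
cmd = "gsplit --filter='gzip > $FILE.gz' - /my/dir/output/file_"

naive = cmd.split()        # splits on whitespace, ignores the quotes
parsed = shlex.split(cmd)  # applies POSIX shell quoting rules

print(naive)
print(parsed)
```

With shlex.split(), the whole --filter=gzip > $FILE.gz text stays together as one argument and the surrounding quotes are removed, which is what the shell would have passed to gsplit.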

Try shlex.split(), which honors shell-style quoting:

# UNTESTED
import shlex
proc2 = subprocess.Popen(shlex.split(cmd2), stdin=proc1.stdout, stdout=subprocess.PIPE)
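Another way to sidestep the quoting problem is to skip the command string entirely and build each argument list by hand, so nothing has to be split. The following is a minimal sketch of that idea, not a tested replacement for the code in the question. The helper names build_commands and run_pipeline are hypothetical, and it assumes gunzip and gsplit are on PATH. Note that $FILE is expanded by the shell that gsplit starts for the filter command, not by Python.

```python
import os
import subprocess


def build_commands(src_dir, names, out_prefix, lines=500000):
    """Return argv lists for the gunzip and gsplit stages (hypothetical helper)."""
    gunzip_cmd = ["gunzip", "-c"] + [os.path.join(src_dir, n) for n in names]
    gsplit_cmd = [
        "gsplit",
        "-l", str(lines),
        "--numeric-suffixes=1",
        "--suffix-length=3",
        "--additional-suffix=.split",
        # A single argument with no extra quotes: gsplit passes this string
        # to a shell, which expands $FILE and performs the redirection.
        "--filter=gzip > $FILE.gz",
        "-",
        out_prefix,
    ]
    return gunzip_cmd, gsplit_cmd


def run_pipeline(gunzip_cmd, gsplit_cmd):
    """Run gunzip | gsplit and return gsplit's exit code."""
    proc1 = subprocess.Popen(gunzip_cmd, stdout=subprocess.PIPE)
    proc2 = subprocess.Popen(gsplit_cmd, stdin=proc1.stdout)
    proc1.stdout.close()  # lets gunzip get SIGPIPE if gsplit exits early
    proc2.wait()
    proc1.wait()
    return proc2.returncode
```

Under these assumptions, a call would look like run_pipeline(*build_commands("/my/dir", ["file1.gz", "file2.gz"], "/my/dir/output/file_")).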
