
Formatting a command in python subprocess popen

I am trying to format the following awk command

awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt

for use in Python's subprocess.Popen. However, I am having a hard time formatting it. I have tried the solutions suggested in similar answers, but none of them worked. I have also tried using raw string literals. I would also prefer not to use shell=True, as it is not recommended.

Edit according to comment: The command I tried was

awk_command = """awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt"""
command_execute = Popen(shlex.split(awk_command))

However, I get the following error upon executing this:

KeyError: 'printf "chr%s\t%s\t%s\n", $1, $2-1, $2'

Googling the error suggests this happens when a value is requested for an undefined key, but I do not understand its context here.

The > character is the shell's redirection operator; awk itself never sees it. To get the same effect in Python, pass an open file as the stdout parameter:

#!/usr/bin/env python
import shlex
import subprocess

cmd = r"""awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}'"""
with open('file2.txt', 'wb', 0) as output_file:
    subprocess.check_call(shlex.split(cmd) + ["file1.txt"], stdout=output_file)

To avoid starting a separate process, you could implement this particular awk command in pure Python.
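As a rough illustration of the pure-Python route, the sketch below mimics the awk program line by line. The function names (convert_line, convert_lines, convert_file) are made up for this example, and it assumes that the second field is always an integer and that blank lines can be skipped (awk would still print a line for them).

```python
def convert_line(line):
    """Mimic awk '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' for one line."""
    # str.split() with no argument splits on runs of whitespace,
    # which matches awk's default field splitting.
    fields = line.split()
    # Assumption: the second field is an integer position.
    # The third column reuses the original text of $2, as awk does.
    return "chr%s\t%d\t%s\n" % (fields[0], int(fields[1]) - 1, fields[1])


def convert_lines(lines):
    """Yield converted lines, skipping blank input lines (an assumption)."""
    for line in lines:
        if line.strip():
            yield convert_line(line)


def convert_file(src, dst):
    """Read src, write the converted lines to dst; no child process needed."""
    with open(src) as infile, open(dst, "w") as outfile:
        outfile.writelines(convert_lines(infile))
```

Calling convert_file("file1.txt", "file2.txt") would then play the role of the whole awk command line, including the redirection.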

  1. The simplest method, especially if you wish to keep the output redirection stuff, is to use subprocess with shell=True - then you only need to escape Python special characters. The line, as a whole, will be interpreted by the default shell.

    • WARNING: do not use this with untrusted input without sanitizing it first!
  2. Alternatively, you can replace the command line with an argv-type sequence and feed that to subprocess instead. Then, you need to provide stuff as the program would see it:

    • remove all the shell-level escaping
    • remove the output redirection stuff and do the redirection yourself instead
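For the first option, a minimal sketch might look like the following. The helper names build_shell_command and run_with_shell are invented for this example; shlex.quote is used on the file names as one way to handle the sanitizing mentioned in the warning, on the assumption that the default shell is POSIX-compatible.

```python
import shlex
import subprocess

# A raw string, so Python passes \t and \n through to the shell untouched.
AWK_PREFIX = r"""awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}'"""


def build_shell_command(input_path, output_path):
    """Return one command line for the shell, with file names quoted."""
    # shlex.quote leaves simple names alone and wraps anything with
    # spaces or shell metacharacters in single quotes.
    return "%s %s > %s" % (
        AWK_PREFIX,
        shlex.quote(input_path),
        shlex.quote(output_path),
    )


def run_with_shell(input_path, output_path):
    """Let the shell parse the line and perform the redirection."""
    subprocess.check_call(build_shell_command(input_path, output_path), shell=True)
```

Here the shell, not Python, interprets the quotes and the > operator, which is why the command can stay as a single string.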

Regarding the specific problems:

  • you didn't escape Python special characters in the string, so \t and \n became a literal tab and newline (try to print awk_command)
  • using shlex.split is little different from shell=True, with added unreliability: it cannot guarantee that it will parse the string the same way your shell would in every case (not to mention that it performs none of the transformations the shell makes).

    • Specifically, it doesn't know or care about the special meaning of the redirection part:

       >>> awk_command = """awk -v OFS="\\t" '{printf "chr%s\\t%s\\t%s\\n", $1, $2-1, $2}' file1.txt > file2.txt"""
       >>> shlex.split(awk_command)
       ['awk', '-v', 'OFS=\\t', '{printf "chr%s\\t%s\\t%s\\n", $1, $2-1, $2}', 'file1.txt', '>', 'file2.txt']

So, if you wish to use shell=False, construct the argument list yourself.
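A hand-built argument list could look roughly like this. The names build_awk_argv and run_awk are hypothetical, chosen for the example; each list element is written the way awk should receive it, with the shell-level quotes removed and the redirection replaced by the stdout parameter.

```python
import subprocess


def build_awk_argv(input_path):
    """Return the argument list exactly as awk should see it."""
    # Raw strings keep \t and \n as backslash sequences, so awk (not
    # Python) turns them into a tab and a newline.
    program = r'{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}'
    # No surrounding quotes: those were only there for the shell.
    return ["awk", "-v", r"OFS=\t", program, input_path]


def run_awk(input_path, output_path):
    """Run awk without a shell and do the redirection in Python."""
    with open(output_path, "wb") as output_file:
        subprocess.check_call(build_awk_argv(input_path), stdout=output_file)
```

With this, run_awk("file1.txt", "file2.txt") should behave like the original shell command, with no '>' or 'file2.txt' leaking into awk's arguments.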
