How to pipe many bash commands from Python?
Hi, I'm trying to call the following command from Python:
comm -3 <(awk '{print $1}' File1.txt | sort | uniq) <(awk '{print $1}' File2.txt | sort | uniq) | grep -v "#" | sed "s/\t//g"
How can I make this call when the inputs to the comm command are themselves piped?
Is there an easy and straightforward way to do it?
I tried the subprocess module:
subprocess.call("comm -3 <(awk '{print $1}' File1.txt | sort | uniq) <(awk '{print $1}' File2.txt | sort | uniq) | grep -v '#' | sed 's/\t//g'")
Without success; it says: OSError: [Errno 2] No such file or directory
Or do I have to create the different calls individually and then chain them using PIPE, as described in the subprocess documentation:
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
Process substitution (<(...)) is bash-only functionality. Thus, you need a shell, but it can't be just any shell (like /bin/sh, as used by shell=True on non-Windows platforms) -- it needs to be bash.
subprocess.call(['bash', '-c', "comm -3 <(awk '{print $1}' File1.txt | sort | uniq) <(awk '{print $1}' File2.txt | sort | uniq) | grep -v '#' | sed 's/\t//g'"])
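The same idea can also be expressed with shell=True plus the executable argument, which swaps the default /bin/sh for bash. A minimal self-contained sketch (the sample files a.txt and b.txt and their contents are assumptions for illustration):

```python
import pathlib
import subprocess

# Throwaway sample data standing in for File1.txt / File2.txt.
pathlib.Path('a.txt').write_text('x 1\ny 2\n')
pathlib.Path('b.txt').write_text('y 3\nz 4\n')

# shell=True normally runs /bin/sh; executable='/bin/bash' makes the
# command run under bash, which understands process substitution.
cmd = ("comm -3 <(awk '{print $1}' a.txt | sort -u) "
       "<(awk '{print $1}' b.txt | sort -u) | tr -d '\\t'")
result = subprocess.run(cmd, shell=True, executable='/bin/bash',
                        capture_output=True, text=True)
print(result.stdout)  # lines whose first column is unique to one file
```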
By the way, if you're going to go this route with arbitrary filenames, pass them out-of-band (as below: passing _ as $0, File1.txt as $1, and File2.txt as $2):
subprocess.call(['bash', '-c',
    '''comm -3 <(awk '{print $1}' "$1" | sort | uniq) '''
    '''        <(awk '{print $1}' "$2" | sort | uniq) '''
    ''' | grep -v '#' | tr -d "\t"''',
    '_', "File1.txt", "File2.txt"])
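To see the positional-argument convention in isolation: after bash -c SCRIPT, the next argument becomes $0 and the remaining ones become $1, $2, and so on. A tiny sketch (echo used purely for demonstration):

```python
import subprocess

# '_' fills $0 (conventionally the program name); 'one' and 'two'
# arrive in the script as $1 and $2, untouched by any shell parsing.
r = subprocess.run(['bash', '-c', 'echo "$0:$1:$2"', '_', 'one', 'two'],
                   capture_output=True, text=True)
print(r.stdout)  # _:one:two
```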
That said, the best-practices approach is indeed to set up the chain yourself. The below is tested with Python 3.6 (note the need for the pass_fds argument to subprocess.Popen to make the file descriptors referred to via the /dev/fd/## links available):
awk_filter='''! /#/ && !seen[$1]++ { print $1 }'''
p1 = subprocess.Popen(['awk', awk_filter],
                      stdin=open('File1.txt', 'r'),
                      stdout=subprocess.PIPE)
p2 = subprocess.Popen(['sort', '-u'],
                      stdin=p1.stdout,
                      stdout=subprocess.PIPE)
p3 = subprocess.Popen(['awk', awk_filter],
                      stdin=open('File2.txt', 'r'),
                      stdout=subprocess.PIPE)
p4 = subprocess.Popen(['sort', '-u'],
                      stdin=p3.stdout,
                      stdout=subprocess.PIPE)
p5 = subprocess.Popen(['comm', '-3',
                       ('/dev/fd/%d' % (p2.stdout.fileno(),)),
                       ('/dev/fd/%d' % (p4.stdout.fileno(),))],
                      pass_fds=(p2.stdout.fileno(), p4.stdout.fileno()),
                      stdout=subprocess.PIPE)
p6 = subprocess.Popen(['tr', '-d', '\t'],
                      stdin=p5.stdout,
                      stdout=subprocess.PIPE)
result = p6.communicate()
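When finishing a chain like this, remember that communicate() returns bytes and that the parent should close its copies of intermediate pipe ends so upstream processes can receive SIGPIPE. A minimal self-contained sketch of that pattern with dummy data:

```python
import subprocess

# A two-stage chain standing in for the longer pipeline above.
p1 = subprocess.Popen(['printf', 'b\na\nb\n'], stdout=subprocess.PIPE)
p2 = subprocess.Popen(['sort', '-u'], stdin=p1.stdout,
                      stdout=subprocess.PIPE)
p1.stdout.close()        # let p1 receive SIGPIPE if p2 exits early
out, _ = p2.communicate()
text = out.decode()      # communicate() yields bytes; decode for str
print(text)
```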
This is a lot more code, but (assuming that the filenames are parameterized in the real world) it's also safer code -- you aren't vulnerable to bugs like ShellShock that are triggered by the simple act of starting a shell, and you don't need to worry about passing variables out-of-band to avoid injection attacks (except in the context of arguments to commands -- like awk -- that are scripting language interpreters themselves).
That said, another thing to consider is implementing the whole thing in native Python.
lines_1 = set(line.split()[0] for line in open('File1.txt', 'r') if '#' not in line)
lines_2 = set(line.split()[0] for line in open('File2.txt', 'r') if '#' not in line)
not_common = (lines_1 - lines_2) | (lines_2 - lines_1)
for line in sorted(not_common):
    print(line)
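The union of the two one-sided differences is just the symmetric difference of the sets, so the same result can be written with the ^ operator. A sketch using inline sample data in place of the files:

```python
# Sets standing in for the first columns read from the two files.
lines_1 = {'x', 'y'}
lines_2 = {'y', 'z'}

# Symmetric difference: elements present in exactly one of the sets.
not_common = lines_1 ^ lines_2
print(sorted(not_common))  # ['x', 'z']
```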
Also check out plumbum; it makes life easier: http://plumbum.readthedocs.io/en/latest/
This may be wrong, but you can try this:
from plumbum.cmd import grep, comm, awk, sort, uniq, sed
_c1 = awk['{print $1}', 'File1.txt'] | sort | uniq
_c2 = awk['{print $1}', 'File2.txt'] | sort | uniq
chain = comm['-3', _c1(), _c2() ] | grep['-v', '#'] | sed['s/\t//g']
chain()
Let me know if this goes wrong; I will try to fix it.
Edit: As pointed out, I missed the process substitution, and I think it would have to be done explicitly by redirecting each command's output to a temporary file and then using that file as the argument to comm.
So the above would actually become:
from plumbum.cmd import grep, comm, awk, sort, uniq, sed
_c1 = awk['{print $1}', 'File1.txt'] | sort | uniq
_c2 = awk['{print $1}', 'File2.txt'] | sort | uniq
(_c1 > "/tmp/File1.txt")(), (_c2 > "/tmp/File2.txt")()
chain = comm['-3', "/tmp/File1.txt", "/tmp/File2.txt" ] | grep['-v', '#'] | sed['s/\t//g']
chain()
Alternatively, you can use the method described by @charles, making use of mkfifo.