Why does subprocess.Popen not wait for the child process to terminate?

Question

I'm having a problem with Python's subprocess.Popen method.

Here is a test script that demonstrates the problem. It is being run on a Linux box.

#!/usr/bin/env python
import subprocess
import time

def run(cmd):
  p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
  return p

### START MAIN
# copy some rows from a source table to a destination table
# note that the destination table is empty when this script is run
cmd = 'mysql -u ve --skip-column-names --batch --execute="insert into destination (select * from source limit 100000)" test'
run(cmd)

# check to see how many rows exist in the destination table
cmd = 'mysql -u ve --skip-column-names --batch --execute="select count(*) from destination" test'
process = run(cmd)
count = (int(process.communicate()[0][:-1]))

# if subprocess.Popen() waited for the child to terminate then count should be
# greater than 0
if count > 0:
  print "success: " + str(count)
else:
  print "failure: " + str(count)
  time.sleep(5)

  # find out how many rows exist in the destination table after sleeping
  process = run(cmd)
  count = (int(process.communicate()[0][:-1]))
  print "after sleeping the count is " + str(count)

Usually the output from this script is:

success: 100000

but sometimes it is:

failure: 0
after sleeping the count is 100000

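Note that in the failure case, the select run immediately after the insert shows 0 rows, but after sleeping for 5 seconds a second select correctly shows a row count of 100000. My conclusion is that one of the following is true:

subprocess.Popen is not waiting for the child process to terminate - this seems to contradict the documentation
the mysql insert is not atomic - my understanding of mysql suggests that inserts are atomic
the select is not seeing the correct row count right away - according to a friend who knows mysql better than I do, this should not happen either

What am I missing?

FYI, I'm aware that this is a hacky way of interacting with mysql from Python and that MySQLdb would likely not have this problem, but I'm curious as to why this method does not work.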

Answer 1

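subprocess.Popen runs the program when it is instantiated. It does not wait for it, however; it fires it off in the background, as if you had typed cmd & in a shell. So the code above essentially defines a race condition: if the insert finishes in time, everything appears normal, but if it does not, you get the unexpected output. You are not waiting for the PID of your first run() to finish; you simply return its Popen instance and continue.

I'm not sure how this behavior contradicts the documentation, because there are some very clear methods on Popen that indicate it does not wait, such as: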
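The point that Popen returns before the child has finished can be checked without a database. The following is a minimal sketch written for Python 3; the sleeping child is a hypothetical stand-in for the slow mysql insert, not part of the original script.

```python
import subprocess
import sys

# Hypothetical stand-in for a slow command such as the mysql insert:
# a child Python process that sleeps for half a second.
slow_cmd = [sys.executable, "-c", "import time; time.sleep(0.5)"]

p = subprocess.Popen(slow_cmd)

# Popen has already returned while the child is still sleeping,
# so poll() has no exit status to report yet and returns None.
still_running = p.poll() is None

# wait() blocks until the child exits and returns its exit status.
exit_status = p.wait()

print("running right after Popen:", still_running)
print("exit status after wait():", exit_status)
```

Any work the parent does between Popen() and wait() overlaps with the child, which is exactly the window in which the second mysql command in the question sometimes runs too early.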

Popen.wait()
  Wait for child process to terminate. Set and return returncode attribute.

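But I agree that the documentation for this module could be improved.

To wait for the program to finish, I'd recommend using one of subprocess's convenience functions, subprocess.call, or calling communicate on a Popen object (for the case where you need stdout). You are already doing this for your second call.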

### START MAIN
# copy some rows from a source table to a destination table
# note that the destination table is empty when this script is run
cmd = 'mysql -u ve --skip-column-names --batch --execute="insert into destination (select * from source limit 100000)" test'
# cmd is a single string with arguments, so it has to go through the shell
subprocess.call(cmd, shell=True)

# check to see how many rows exist in the destination table
cmd = 'mysql -u ve --skip-column-names --batch --execute="select count(*) from destination" test'
process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
try:
  count = int(process.communicate()[0][:-1])
except ValueError:
  # empty or non-numeric output, e.g. when mysql reported an error
  count = 0
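The same ordering guarantee can be tried without mysql. In this sketch (Python 3), the two child commands and the scratch file are hypothetical stand-ins for the insert and the count query; only subprocess.call, Popen and communicate come from the answer above.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-ins for the two mysql commands: the "insert" writes
# 1000 lines to a scratch file after a short delay, and the "count"
# reports how many lines the file holds.
fd, path = tempfile.mkstemp()
os.close(fd)

insert_cmd = [sys.executable, "-c",
              "import sys, time; time.sleep(0.3); "
              "open(sys.argv[1], 'w').write('row\\n' * 1000)", path]
count_cmd = [sys.executable, "-c",
             "import sys; print(sum(1 for _ in open(sys.argv[1])))", path]

# call() blocks until the child exits, so the write is complete here.
insert_status = subprocess.call(insert_cmd)

# communicate() also waits, and returns the captured stdout as bytes.
p = subprocess.Popen(count_cmd, stdout=subprocess.PIPE)
out, _ = p.communicate()
count = int(out.strip())

os.remove(path)
print("count:", count)
```

Using strip() instead of slicing off the last character avoids depending on exactly one trailing newline in the output.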

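Additionally, in most cases you do not need to run the command in a shell. This is one of those cases, but you will have to rewrite your command as a sequence. Doing it that way also avoids traditional shell injection and lets you stop worrying about quoting, like so: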

prog = ["mysql", "-u", "ve", "--execute", 'insert into foo values ("snargle", 2)']
subprocess.call(prog)

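This even works, and will not inject the way you might expect (the < is passed to printf as a literal argument rather than acting as a redirection):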

prog = ["printf", "%s", "<", "/etc/passwd"]
subprocess.call(prog)

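Try it interactively. You avoid the possibility of shell injection, which matters particularly if you are accepting user input. I suspect you are using the less-awesome string method of communicating with subprocess because you ran into trouble getting the sequences to work :^)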
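To see that a sequence bypasses the shell entirely, one can echo back what the child actually receives. This is a small Python 3 sketch; the echoing child and the user_input value are made up for illustration.

```python
import subprocess
import sys

# Child that simply echoes the arguments it received, one per line.
echo_args = [sys.executable, "-c",
             "import sys; print('\\n'.join(sys.argv[1:]))"]

# Hypothetical hostile input that would be dangerous inside a shell string.
user_input = "x; echo injected"

# No shell is involved, so "<" and ";" reach the child as plain text.
out = subprocess.check_output(echo_args + ["<", "/etc/passwd", user_input])
received = out.decode().splitlines()
print(received)
```

Each list element arrives as exactly one argument, so nothing in user_input can start a second command or redirect a file.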

Answer 2

If you don't absolutely need to use subprocess and popen, it's usually simpler to use os.system. For example, for quick scripts I often do something like this:

import os
run = os.system #convenience alias
result = run('mysql -u ve --execute="select * from wherever" test')

Unlike popen, os.system DOES wait for your process to return before moving on to the next stage of your script.

More information on it is in the docs: http://docs.python.org/library/os.html#os.system
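The blocking behavior of os.system is easy to confirm with a timing check. A rough sketch for Python 3 follows; the sleeping child command is hypothetical. Note that the value os.system returns is the command's exit status, not its output, so it cannot replace communicate() when the row count itself is needed.

```python
import os
import sys
import time

# Hypothetical slow command: a child Python process sleeping for 0.3 s.
cmd = '"%s" -c "import time; time.sleep(0.3)"' % sys.executable

start = time.time()
status = os.system(cmd)   # blocks until the shell command has finished
elapsed = time.time() - start

# status is an exit status (0 on success), not the command's stdout.
print("status: %d, elapsed: %.2f s" % (status, elapsed))
```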

Answer 3

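Dude, why do you think subprocess.Popen returns an object with a wait method, unless it's because the waiting is NOT implicit, intrinsic, immediate, and inevitable, as you appear to surmise...?! The most common reason to spawn a subprocess is NOT to immediately wait for it to finish, but rather to let it proceed (e.g. on another core, or at worst by time-slicing; that's the operating system's and the hardware's lookout) at the same time as the parent process continues. When the parent process needs to wait for the subprocess to finish, it will obviously call wait on the object returned by the original subprocess.Popen call.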
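The "start now, wait later" pattern described above might look like the following in Python 3. The three sleeping children are a hypothetical workload chosen only to make the overlap measurable.

```python
import subprocess
import sys
import time

# Hypothetical workload: three children that each sleep for one second.
child_cmd = [sys.executable, "-c", "import time; time.sleep(1)"]

start = time.time()
children = [subprocess.Popen(child_cmd) for _ in range(3)]  # all start now
exit_codes = [c.wait() for c in children]                   # wait when needed
elapsed = time.time() - start

print("exit codes: %s, elapsed: %.1f s" % (exit_codes, elapsed))
```

Because the children run concurrently, the total time should be close to one second rather than the three seconds that sequential subprocess.call invocations would take.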

Answer 1: score 21, accepted, 2009-10-09 01:05:31

Answer 2: score 7, 2009-10-09 03:54:09

Answer 3: score 3, 2009-10-09 03:51:15
