Why is subprocess.Popen not waiting until the child process terminates?

I'm having a problem with Python's subprocess.Popen method.

Here's a test script which demonstrates the problem. It's being run on a Linux box.

#!/usr/bin/env python
import subprocess
import time

def run(cmd):
  p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
  return p

### START MAIN
# copy some rows from a source table to a destination table
# note that the destination table is empty when this script is run
cmd = 'mysql -u ve --skip-column-names --batch --execute="insert into destination (select * from source limit 100000)" test'
run(cmd)

# check to see how many rows exist in the destination table
cmd = 'mysql -u ve --skip-column-names --batch --execute="select count(*) from destination" test'
process = run(cmd)
count = (int(process.communicate()[0][:-1]))

# if subprocess.Popen() waited for the child to terminate then count should be
# greater than 0
if count > 0:
  print "success: " + str(count)
else:
  print "failure: " + str(count)
  time.sleep(5)

  # find out how many rows exist in the destination table after sleeping
  process = run(cmd)
  count = (int(process.communicate()[0][:-1]))
  print "after sleeping the count is " + str(count)

Usually the output from this script is:

success: 100000

but sometimes it's

failure: 0
after sleeping the count is 100000

Note that in the failure case, the select immediately after the insert shows 0 rows, but after sleeping for 5 seconds a second select correctly shows a row count of 100000. My conclusion is that one of the following is true:

  1. subprocess.Popen is not waiting for the child process to terminate - this seems to contradict the documentation
  2. the mysql insert is not atomic - my understanding of mysql seems to indicate insert is atomic
  3. the select is not seeing the correct row count right away - according to a friend who knows mysql better than I do, this should not happen either

What am I missing?

FYI, I'm aware that this is a hacky way of interacting with mysql from Python and MySQLdb would likely not have this problem, but I'm curious as to why this method does not work.

subprocess.Popen, when instantiated, runs the program. It does not, however, wait for it -- it fires it off in the background as if you'd typed cmd & in a shell. So, in the code above, you've essentially defined a race condition -- if the insert finishes in time, it will appear normal, but if not you get the unexpected output. You are not waiting for your first run()'d process to finish; you are simply returning its Popen instance and continuing.

I'm not sure how this behavior contradicts the documentation, because there are some very clear methods on Popen that seem to indicate it is not waited for, like:
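The non-blocking behaviour can be seen without mysql at all. The following is a minimal Python 3 sketch, not the asker's script: the sleeping child is a hypothetical stand-in for the slow insert, and the variable names are made up for illustration. It shows that poll() still reports a running child right after Popen returns, and that wait() is what actually blocks.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for the slow mysql insert: a child that sleeps for a second.
slow_cmd = [sys.executable, "-c", "import time; time.sleep(1)"]

start = time.monotonic()
p = subprocess.Popen(slow_cmd)        # returns as soon as the child has been started
status_right_after = p.poll()         # None means the child is still running

returncode = p.wait()                 # blocks until the child exits
elapsed_after_wait = time.monotonic() - start

print("poll() right after Popen:", status_right_after)
print("return code after wait():", returncode)
print("seconds until wait() returned: %.2f" % elapsed_after_wait)
```

Anything the parent does between the Popen call and the wait() call runs while the child is still working, which is exactly the window in which the question's select can see 0 rows.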

Popen.wait()
  Wait for child process to terminate. Set and return returncode attribute.

I do agree, however, that the documentation for this module could be improved.

To wait for the program to finish, I'd recommend using subprocess's convenience function, subprocess.call, or using communicate on a Popen object (for the case when you need stdout). You are already doing this for your second call.

### START MAIN
# copy some rows from a source table to a destination table
# note that the destination table is empty when this script is run
cmd = 'mysql -u ve --skip-column-names --batch --execute="insert into destination (select * from source limit 100000)" test'
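# cmd is a single string, so it has to go through the shell; call() waits for it to finish
subprocess.call(cmd, shell=True)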

# check to see how many rows exist in the destination table
cmd = 'mysql -u ve --skip-column-names --batch --execute="select count(*) from destination" test'
process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
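try:
  # communicate() reads all of stdout and waits for the child to exit
  count = int(process.communicate()[0][:-1])
except ValueError:
  # mysql printed nothing that parses as a number
  count = 0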
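On Python 3.7 or later, the same wait-then-read pattern can be written with subprocess.run, which always waits for the child. This is only a sketch under stated assumptions: the two child commands below are hypothetical stand-ins for the mysql insert and select (a slow writer and a line counter working on a temporary file), so it can be run without a database.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-ins for the two mysql commands: a slow writer and a counter.
fd, path = tempfile.mkstemp()
os.close(fd)

writer_code = ("import pathlib, sys, time; time.sleep(0.5); "
               "pathlib.Path(sys.argv[1]).write_text('row\\n' * 1000)")
counter_code = ("import pathlib, sys; "
                "print(len(pathlib.Path(sys.argv[1]).read_text().splitlines()))")

# run() waits for the child; check=True raises CalledProcessError on a non-zero exit.
subprocess.run([sys.executable, "-c", writer_code, path], check=True)

# The writer has finished by now, so the counter always sees every row.
result = subprocess.run([sys.executable, "-c", counter_code, path],
                        check=True, stdout=subprocess.PIPE, text=True)
count = int(result.stdout.strip())
print("count:", count)

os.remove(path)
```

Because the first run() does not return until the writer has exited, there is no race between the two steps, however slow the writer is.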

Additionally, in most cases, you do not need to run the command in a shell. This is one of those cases, but you'll have to rewrite your command as a sequence. Doing it that way also allows you to avoid traditional shell injection and worry less about quoting, like so:

prog = ["mysql", "-u", "ve", "--execute", 'insert into foo values ("snargle", 2)']
subprocess.call(prog)

This will even work, and will not inject as you might fear, because without a shell the < reaches printf as a literal argument rather than acting as a redirection:

prog = ["printf", "%s", "<", "/etc/passwd"]
subprocess.call(prog)
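To check this without relying on printf, here is a small Python 3 sketch. The child program and the user_input value are hypothetical, chosen only to show that shell metacharacters in a list argument arrive in the child unchanged.

```python
import subprocess
import sys

# Hypothetical input that would be dangerous if a shell ever interpreted it.
user_input = "x; echo INJECTED < /etc/passwd"

# The child simply prints back its first argument.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input]

# No shell is involved, so ';' and '<' are ordinary characters in one argument.
output = subprocess.run(child, check=True,
                        stdout=subprocess.PIPE, text=True).stdout
print(repr(output))
```

Had the same text been interpolated into a command string and run with shell=True, the shell would have parsed the ; and < itself.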

Try it interactively. You avoid the possibilities of shell injection, particularly if you're accepting user input. I suspect you're using the less-awesome string method of communicating with subprocess because you ran into trouble getting the sequences to work :^)

If you don't absolutely need to use subprocess and Popen, it's usually simpler to use os.system. For example, for quick scripts I often do something like this:

import os
run = os.system #convenience alias
result = run('mysql -u ve --execute="select * from wherever" test')

Unlike Popen, os.system DOES wait for your process to return before moving on to the next stage of your script.

More info on it in the docs: http://docs.python.org/library/os.html#os.system
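One thing worth keeping in mind with this approach: os.system returns the command's exit status, not its output, so the result variable above would not hold the selected rows. A rough Python 3 sketch, assuming a POSIX shell and using a hypothetical sleeping child in place of mysql:

```python
import os
import sys
import time

# Hypothetical stand-in command: a child that sleeps, then prints one line.
cmd = '"{}" -c "import time; time.sleep(0.5); print(42)"'.format(sys.executable)

start = time.monotonic()
status = os.system(cmd)               # blocks until the command has finished
elapsed = time.monotonic() - start

# The child's "42" went straight to the terminal; status only says how it exited.
print("status:", status, "elapsed: %.2fs" % elapsed)
```

If the output is needed in the parent, communicate() on a Popen object, as used elsewhere on this page, is the tool for that.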

Dude, why did you think subprocess.Popen returned an object with a wait method, unless it was because the waiting was NOT implicit, intrinsic, immediate, and inevitable, as you appear to surmise...?! The most common reason to spawn a subprocess is NOT to immediately wait for it to finish, but rather to let it proceed (e.g. on another core, or at worst by time-slicing -- that's the operating system's -- and hardware's -- lookout) at the same time as the parent process continues; when the parent process needs to wait for the subprocess to be finished, it will obviously call wait on the object returned by the original subprocess.Popen call.
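That "let it proceed, wait later" pattern might look like the following Python 3 sketch. The sleeping children and the parent's busy work are hypothetical placeholders; the point is only the ordering of start, work, and wait.

```python
import subprocess
import sys
import time

# Hypothetical child: any long-running command would do.
child = [sys.executable, "-c", "import time; time.sleep(1)"]

start = time.monotonic()
procs = [subprocess.Popen(child) for _ in range(3)]   # all three start right away

parent_work = sum(range(100000))      # the parent keeps working in the meantime

returncodes = [p.wait() for p in procs]               # block only when results are needed
elapsed = time.monotonic() - start

print("return codes:", returncodes)
print("total seconds: %.2f" % elapsed)
```

Since the three children overlap, the total time is typically close to that of one child rather than the sum of all three.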
