
Detecting the end of the stream on popen.stdout.readline

I have a python program which launches subprocesses using Popen and consumes their output in near real-time as it is produced. The code of the relevant loop is:

def run(self, output_consumer):
    self.prepare_to_run()
    popen_args = self.get_popen_args()
    logging.debug("Calling popen with arguments %s" % popen_args)
    self.popen = subprocess.Popen(**popen_args)
    while True:
        outdata = self.popen.stdout.readline()
        if not outdata and self.popen.returncode is not None:
            # Terminate when we've read all the output and the returncode is set
            break
        output_consumer.process_output(outdata)
        self.popen.poll()  # updates returncode so we can exit the loop
    output_consumer.finish(self.popen.returncode)
    self.post_run()

def get_popen_args(self):
    return {
        'args': self.command,
        'shell': False, # Just being explicit for security's sake
        'bufsize': 0,   # More likely to see what's being printed as it happens
                        # Not guaranteed since the process itself might buffer its output
                        # run `python -u` to unbuffer the output of a python process
        'cwd': self.get_cwd(),
        'env': self.get_environment(),
        'stdout': subprocess.PIPE,
        'stderr': subprocess.STDOUT,
        'close_fds': True,  # Doesn't seem to matter
    }

This works great on my production machines, but on my dev machine, the call to .readline() hangs when certain subprocesses complete. That is, it will successfully process all of the output, including the final output line saying "process complete", but then will again poll readline and never return. This method exits properly on the dev machine for most of the sub-processes I call, but consistently fails to exit for one complex bash script that itself calls many sub-processes.

It's worth noting that popen.returncode gets set to a non-None (usually 0) value many lines before the end of the output. So I can't just break out of the loop when that is set or else I lose everything that gets spat out at the end of the process and is still buffered waiting for reading. The problem is that when I'm flushing the buffer at that point, I can't tell when I'm at the end because the last call to readline() hangs. Calling read() also hangs. Calling read(1) gets me every last character out, but also hangs after the final line. popen.stdout.closed is always False. How can I tell when I'm at the end?

All systems are running python 2.7.3 on Ubuntu 12.04 LTS. FWIW, stderr is being merged with stdout using stderr=subprocess.STDOUT.

Why the difference? Is it failing to close stdout for some reason? Could the sub-sub-processes do something to keep it open somehow? Could it be because I'm launching the process from a terminal on my dev box, but in production it's launched as a daemon through supervisord? Would that change the way the pipes are processed, and if so, how do I normalize them?

The main code loop looks right. It could be that the pipe isn't closing because another process is keeping it open. For example, if the script launches a background process that writes to stdout, then the pipe will not close. Are you sure no other child process is still running?
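
For illustration (this snippet is not from the original post), here is a minimal way to reproduce that situation: bash exits almost immediately, but a backgrounded sleep inherits the pipe's write end and holds it open, so the final readline() blocks even though returncode has already been set.

import subprocess

# bash exits right away, but the backgrounded `sleep` inherits the pipe
# and keeps its write end open for roughly 60 seconds
p = subprocess.Popen(['bash', '-c', 'echo parent done; sleep 60 &'],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
print repr(p.stdout.readline())   # 'parent done\n'
p.poll()                          # returncode becomes 0 almost immediately
print repr(p.stdout.readline())   # blocks until the sleep exits and the pipe closes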

An idea is to change modes once you see that .returncode has been set. Once you know the main process is done, read all of its output from the buffer, but don't get stuck waiting. You can use select to read from the pipe with a timeout. Set a timeout of a few seconds and you can clear the buffer without getting stuck waiting on a child process.
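
A rough sketch of that idea, assuming a drain_remaining() helper invented here for illustration: once returncode is set, keep selecting on the pipe with a timeout and stop after a few seconds of silence or on EOF.

import os
import select

def drain_remaining(pipe, timeout=3.0):
    # Read whatever is still buffered after the main process has exited,
    # giving up after `timeout` seconds without any new data.
    leftover = ''
    while True:
        ready, _, _ = select.select([pipe], [], [], timeout)
        if not ready:
            break                              # timed out: assume nothing more is coming
        chunk = os.read(pipe.fileno(), 4096)
        if not chunk:
            break                              # empty read: the write end was closed (EOF)
        leftover += chunk
    return leftover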

If you use readline() or read(), it should not hang. There is no need to check returncode or call poll(). If it is hanging when you know the process is finished, it is most probably a subprocess keeping your pipe open, as others have said before.
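
For reference, the usual end-of-stream pattern looks roughly like this (written against the names from the question's run() method, so it is only a sketch): readline() returns an empty string only after every writer has closed the pipe, so there is no need to watch returncode at all.

# inside run(), after self.popen has been created
for line in iter(self.popen.stdout.readline, ''):
    output_consumer.process_output(line)
output_consumer.finish(self.popen.wait())   # wait() reaps the child and returns its exit code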

There are two things you could do to debug this:

* Try to reproduce with a minimal script instead of the current complex one, or
* Run that complex script with strace -f -e clone,execve,exit_group and see what that script is starting, and whether any process outlives the main script (check when the main script calls exit_group; if strace is still waiting after that, you have a child that is still alive).

Without knowing the contents of the "one complex bash script" which causes the problem, there are too many possibilities to determine the exact cause.

However, given that you say it works when you run your Python script under supervisord, it might be getting stuck because a sub-process is trying to read from stdin, or just behaves differently when stdin is a tty, which (I presume) supervisord will redirect from /dev/null.

This minimal example seems to cope better with cases where my example test.sh runs subprocesses which try to read from stdin...

import os
import subprocess

f = subprocess.Popen(args='./test.sh',
                     shell=False,
                     bufsize=0,
                     stdin=open(os.devnull, 'rb'),
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     close_fds=True)

while 1:
    s = f.stdout.readline()
    if not s and f.returncode is not None:
        break
    print s.strip()
    f.poll()
print "done %d" % f.returncode

Otherwise, you can always fall back to using a non-blocking read, and bail out when you get your final output line saying "process complete", although it's a bit of a hack.
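
A hedged sketch of what a non-blocking read could look like (the helper names here are made up for illustration): switch the pipe's descriptor into O_NONBLOCK mode, after which os.read() returns immediately instead of hanging.

import errno
import fcntl
import os

def set_nonblocking(pipe):
    # Put the pipe's file descriptor into non-blocking mode.
    fd = pipe.fileno()
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

def read_available(pipe):
    # Return whatever is buffered right now, '' at EOF, or None if no data yet.
    try:
        return os.read(pipe.fileno(), 4096)
    except OSError as exc:
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise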

I find that calls to read (or readline) sometimes hang, despite previously calling poll. So I resorted to calling select to find out if there is readable data. However, select without a timeout can hang, too, if the process was closed. So I call select in a semi-busy loop with a tiny timeout for each iteration (see below).

I'm not sure if you can adapt this to readline, as readline might hang if the final '\n' is missing, or if the process doesn't close its stdout before you close its stdin and/or terminate it. You could wrap this in a generator, and every time you encounter a '\n' in stdout_collected, yield the current line (a sketch of such a generator follows the code below).

Also note that in my actual code, I'm using pseudoterminals (pty) to wrap the popen handles (to more closely fake user input), but it should work without them.

import os
import select
from datetime import datetime

# handle to read from
handle = self.popen.stdout

# how many seconds to wait without data
timeout = 1

begin = datetime.now()
stdout_collected = ""

while self.popen.poll() is None:
    try:
        fds = select.select([handle], [], [], 0.01)[0]
    except select.error, exc:
        print exc
        break

    if len(fds) == 0:
        # select timed out, no new data
        delta = (datetime.now() - begin).total_seconds()
        if delta > timeout:
            return stdout_collected

        # try longer
        continue
    else:
        # have data, timeout counter resets again
        begin = datetime.now()

    for fd in fds:
        if fd == handle:
            data = os.read(handle.fileno(), 1024)
            # can handle the bytes as they come in here
            # self._handle_stdout(data)
            stdout_collected += data

# process exited
# if using a pseudoterminal, close the handles here
self.popen.wait()
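
Here is a small sketch of that generator idea (a hypothetical helper, not taken from the answer above): feed it the raw chunks collected by the loop and it yields complete lines whenever a '\n' shows up.

def iter_lines(chunks):
    # `chunks` is any iterable of raw reads (e.g. the os.read() results above);
    # buffer them and yield one complete line at a time.
    pending = ''
    for chunk in chunks:
        pending += chunk
        while '\n' in pending:
            line, pending = pending.split('\n', 1)
            yield line + '\n'
    if pending:
        yield pending   # trailing output that had no final newline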

Why are you setting stderr to STDOUT?

The real benefit of making a communicate() call on a subprocess is that you are able to retrieve a tuple containing the stdout response as well as the stderr message.

Those might be useful if the logic depends on their success or failure.

Also, it would save you from the pain of having to iterate through lines. communicate() gives you everything, and there would be no unresolved questions about whether or not the full message was received.
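
A minimal sketch of that approach, reusing the ./test.sh example from earlier (note that communicate() blocks until the child exits, so the output is not consumed in real time):

from subprocess import Popen, PIPE

# Keep stderr separate so stdout and stderr can be inspected independently.
p = Popen(['./test.sh'], stdout=PIPE, stderr=PIPE, close_fds=True)
out, err = p.communicate()          # returns only after the child has exited
print "stdout:", repr(out)
print "stderr:", repr(err)
print "returncode:", p.returncode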

I wrote a demo with a bash subprocess that can be easily explored. A closed pipe can be recognized by '' in the output from readline(), while the output from an empty line is '\n'.

from subprocess import Popen, PIPE, STDOUT
p = Popen(['bash'], stdout=PIPE, stderr=STDOUT)
out = []
while True:
    outdata = p.stdout.readline()
    if not outdata:
        break
    #output_consumer.process_output(outdata)
    print "* " + repr(outdata)
    out.append(outdata)
print "* closed", repr(out)
print "* returncode", p.wait()

Example of input/output: closing the pipe explicitly before terminating the process. This is why wait() should be used instead of poll().

[prompt] $ python myscript.py
echo abc
* 'abc\n'
exec 1>&- # close stdout
exec 2>&- # close stderr
* closed ['abc\n']
exit
* returncode 0
[prompt] $

Your code did output a huge number of empty strings for this case.


Example: a quickly terminated process without '\n' on the last line:

echo -n abc
exit
* 'abc'
* closed ['abc']
* returncode 0
