Python: read from subprocess stdout and stderr separately while preserving order

Question

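I have a Python subprocess that I'm trying to read the output and error streams from. It currently works, but I can only read from stderr after I've finished reading from stdout. Here's what it looks like: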

process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout_iterator = iter(process.stdout.readline, b"")
stderr_iterator = iter(process.stderr.readline, b"")

for line in stdout_iterator:
    # Do stuff with line
    print line

for line in stderr_iterator:
    # Do stuff with line
    print line

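As you can see, the stderr for loop can't start until the stdout loop completes. How can I modify this so that I read from both streams in the order the lines come in?

To clarify: I still need to be able to tell whether a line came from stdout or stderr, because they will be treated differently in my code.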

Answer 1

The code in your question may deadlock if the child process produces enough output on stderr (about 100 KB on my Linux machine).

There is a communicate() method that allows reading from stdout and stderr separately:

from subprocess import Popen, PIPE

process = Popen(command, stdout=PIPE, stderr=PIPE)
output, err = process.communicate()
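
As a small illustration of the call above, here is a sketch that runs a child writing to both streams and collects each one separately. The child command (a one-line script run through sys.executable) is made up for this example. Note that communicate() only returns after the child exits, so the relative order of the two streams is not available this way.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child used only for this illustration: it writes one line
# to stdout and one line to stderr.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"
command = [sys.executable, "-c", child_code]

process = Popen(command, stdout=PIPE, stderr=PIPE)
# communicate() drains both pipes until EOF and waits for the child,
# so neither pipe can fill up and block the child.
output, err = process.communicate()

print("stdout:", output.decode().strip())
print("stderr:", err.decode().strip())
```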

If you need to read the streams while the child process is still running, the portable solution is to use threads (not tested):

from subprocess import Popen, PIPE
from threading import Thread
from Queue import Queue # Python 2

def reader(pipe, queue):
    try:
        with pipe:
            for line in iter(pipe.readline, b''):
                queue.put((pipe, line))
    finally:
        queue.put(None)

process = Popen(command, stdout=PIPE, stderr=PIPE, bufsize=1)
q = Queue()
Thread(target=reader, args=[process.stdout, q]).start()
Thread(target=reader, args=[process.stderr, q]).start()
for _ in range(2):
    for source, line in iter(q.get, None):
        print "%s: %s" % (source, line),
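
For readers on Python 3, the same idea might look roughly like the sketch below. It is an adaptation, not the answer's original code: the function name iter_tagged, the stream labels, and the demo child are assumptions made for illustration, and text=True requires Python 3.7 or later. Each reader thread tags its lines with a label instead of the pipe object, so the consumer can tell the two streams apart.

```python
import sys
from queue import Queue
from subprocess import Popen, PIPE
from threading import Thread

def reader(name, pipe, queue):
    """Push (name, line) for every line in pipe, then a None sentinel at EOF."""
    try:
        with pipe:
            for line in pipe:
                queue.put((name, line))
    finally:
        queue.put(None)

def iter_tagged(command):
    """Yield (stream name, line) pairs in the order the reader threads saw them."""
    process = Popen(command, stdout=PIPE, stderr=PIPE, text=True)
    q = Queue()
    Thread(target=reader, args=("stdout", process.stdout, q)).start()
    Thread(target=reader, args=("stderr", process.stderr, q)).start()
    for _ in range(2):  # one None sentinel per reader thread
        for name, line in iter(q.get, None):
            yield name, line
    process.wait()

# Hypothetical child for demonstration only
child_code = "import sys; print('out line'); print('err line', file=sys.stderr)"
for name, line in iter_tagged([sys.executable, "-c", child_code]):
    print(f"{name}: {line}", end="")
```

The order in which lines from different streams reach the queue depends on thread scheduling, so it approximates the order of the writes rather than guaranteeing it.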

Answer 2

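Here is a solution based on selectors, but one that preserves order and streams variable-length chunks (even single characters).

The trick is to use read1() instead of read().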

import selectors
import subprocess
import sys

p = subprocess.Popen(
    ["python", "random_out.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE
)

sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)

while True:
    for key, _ in sel.select():
        data = key.fileobj.read1().decode()
        if not data:
            exit()
        if key.fileobj is p.stdout:
            print(data, end="")
        else:
            print(data, end="", file=sys.stderr)
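
The loop above exits as soon as either stream reports EOF. A possible variation, sketched below, unregisters each pipe when it reaches EOF and keeps going until both are done. The function name stream_chunks, the labels passed as data, and the demo child are assumptions for illustration. This relies on selectors working with pipes, which holds on Unix-like systems but not on Windows.

```python
import selectors
import subprocess
import sys

def stream_chunks(command):
    """Yield (stream name, text) chunks as they arrive until both pipes reach EOF."""
    with subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE) as p:
        with selectors.DefaultSelector() as sel:
            sel.register(p.stdout, selectors.EVENT_READ, data="stdout")
            sel.register(p.stderr, selectors.EVENT_READ, data="stderr")
            while sel.get_map():  # loop while at least one pipe is registered
                for key, _ in sel.select():
                    # read1() returns whatever is available without waiting
                    # for the full 4096 bytes
                    chunk = key.fileobj.read1(4096)
                    if not chunk:  # empty bytes means EOF on this pipe
                        sel.unregister(key.fileobj)
                        continue
                    # errors="replace" avoids an exception if a multi-byte
                    # character happens to be split across two chunks
                    yield key.data, chunk.decode(errors="replace")

# Hypothetical child for demonstration only
child_code = "import sys; sys.stdout.write('out'); sys.stderr.write('err')"
for name, text in stream_chunks([sys.executable, "-c", child_code]):
    print(f"{name}: {text!r}")
```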

If you need a test program, use this one.

import sys
from time import sleep


for i in range(10):
    print(f" x{i} ", file=sys.stderr, end="")
    sleep(0.1)
    print(f" y{i} ", end="")
    sleep(0.1)

Answer 3

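The order in which a process writes data to different pipes is lost after the write.

There is no way to tell whether stdout was written before stderr.

You can try to read from multiple file descriptors simultaneously, in a non-blocking way, as soon as data is available, but this only minimizes the probability that the order comes out wrong.

This program should demonstrate it: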

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import select
import subprocess

testapps={
    'slow': '''
import os
import time
os.write(1, 'aaa')
time.sleep(0.01)
os.write(2, 'bbb')
time.sleep(0.01)
os.write(1, 'ccc')
''',
    'fast': '''
import os
os.write(1, 'aaa')
os.write(2, 'bbb')
os.write(1, 'ccc')
''',
    'fast2': '''
import os
os.write(1, 'aaa')
os.write(2, 'bbbbbbbbbbbbbbb')
os.write(1, 'ccc')
'''
}

def readfds(fds, maxread):
    while True:
        fdsin, _, _ = select.select(fds,[],[])
        for fd in fdsin:
            s = os.read(fd, maxread)
            if len(s) == 0:
                fds.remove(fd)
                continue
            yield fd, s
        if fds == []:
            break

def readfromapp(app, rounds=10, maxread=1024):
    f=open('testapp.py', 'w')
    f.write(testapps[app])
    f.close()

    results={}
    for i in range(0, rounds):
        p = subprocess.Popen(['python', 'testapp.py'], stdout=subprocess.PIPE
                                                     , stderr=subprocess.PIPE)
        data=''
        for (fd, s) in readfds([p.stdout.fileno(), p.stderr.fileno()], maxread):
            data = data + s
        results[data] = results[data] + 1 if data in results else 1

    print 'running %i rounds %s with maxread=%i' % (rounds, app, maxread)
    results = sorted(results.items(), key=lambda (k,v): k, reverse=False)
    for data, count in results:
        print '%03i x %s' % (count, data)


print
print "=> if output is produced slowly this should work as whished"
print "   and should return: aaabbbccc"
readfromapp('slow',  rounds=100, maxread=1024)

print
print "=> now mostly aaacccbbb is returnd, not as it should be"
readfromapp('fast',  rounds=100, maxread=1024)

print
print "=> you could try to read data one by one, and return"
print "   e.g. a whole line only when LF is read"
print "   (b's should be finished before c's)"
readfromapp('fast',  rounds=100, maxread=1)

print
print "=> but even this won't work ..."
readfromapp('fast2', rounds=100, maxread=1)

It produces output like the following:

=> if output is produced slowly this should work as whished
   and should return: aaabbbccc
running 100 rounds slow with maxread=1024
100 x aaabbbccc

=> now mostly aaacccbbb is returnd, not as it should be
running 100 rounds fast with maxread=1024
006 x aaabbbccc
094 x aaacccbbb

=> you could try to read data one by one, and return
   e.g. a whole line only when LF is read
   (b's should be finished before c's)
running 100 rounds fast with maxread=1
003 x aaabbbccc
003 x aababcbcc
094 x abababccc

=> but even this won't work ...
running 100 rounds fast2 with maxread=1
003 x aaabbbbbbbbbbbbbbbccc
001 x aaacbcbcbbbbbbbbbbbbb
008 x aababcbcbcbbbbbbbbbbb
088 x abababcbcbcbbbbbbbbbb
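
The underlying reason can also be shown without a child process. In the sketch below, two os.pipe() pairs stand in for a child's stdout and stderr (an assumption made to keep the example deterministic; it needs a Unix-like system, since select() on pipes is not supported on Windows). Once all three writes have happened, the reader only sees that both pipes are readable, and the bytes within each pipe are already concatenated.

```python
import os
import select

# Two independent pipes stand in for a child's stdout and stderr.
out_r, out_w = os.pipe()
err_r, err_w = os.pipe()

# The "child" writes in the order out, err, out before the reader wakes up.
os.write(out_w, b"aaa")
os.write(err_w, b"bbb")
os.write(out_w, b"ccc")

# select() only reports which descriptors are readable, not who wrote first.
ready, _, _ = select.select([out_r, err_r], [], [])
out_data = os.read(out_r, 1024)
err_data = os.read(err_r, 1024)

print(set(ready) == {out_r, err_r})  # True: both pipes are readable
print(out_data)  # b'aaaccc'
print(err_data)  # b'bbb'

for fd in (out_r, out_w, err_r, err_w):
    os.close(fd)
```

Nothing in what the reader receives records that 'bbb' was written between 'aaa' and 'ccc'.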

Answer 4

This works for Python 3 (3.6):

import selectors
import subprocess
import sys

# cmd is the command to run (a string or a list of arguments)
p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE, universal_newlines=True)
# Read both stdout and stderr simultaneously
sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)
ok = True
while ok:
    for key, _ in sel.select():
        line = key.fileobj.readline()
        if not line:
            # An empty string means this stream reached EOF; stop reading
            ok = False
            break
        if key.fileobj is p.stdout:
            print(f"STDOUT: {line}", end="")
        else:
            print(f"STDERR: {line}", end="", file=sys.stderr)

Answer 5

From https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module

If you wish to capture and combine both streams into one, use stdout=PIPE and stderr=STDOUT instead of capture_output.

So the simplest solution would be:

process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
stdout_iterator = iter(process.stdout.readline, b"")

for line in stdout_iterator:
    # Do stuff with line
    print line
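
A Python 3 rendering of the same approach might look like the sketch below. The child script is hypothetical and flushes after every line so that the writes reach the shared pipe in program order; text=True requires Python 3.7 or later. Because both streams go through one pipe, the order is preserved, but the reader can no longer tell which stream a line came from.

```python
import subprocess
import sys

# Hypothetical child for demonstration only: alternates between the streams.
child_code = (
    "import sys\n"
    "print('first (stdout)', flush=True)\n"
    "print('second (stderr)', file=sys.stderr, flush=True)\n"
    "print('third (stdout)', flush=True)\n"
)

process = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # stderr is redirected into stdout's pipe
    text=True,
)

merged_lines = []
with process.stdout:
    for line in process.stdout:
        merged_lines.append(line.rstrip("\n"))
        print(line, end="")
process.wait()
```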

Answer 7

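I know this question is very old, but this answer may help others who stumble upon this page while researching a solution for a similar situation, so I'm posting it anyway.

I've built a simple Python snippet that merges any number of pipes into a single one. Of course, as stated above, the order cannot be guaranteed, but this is as close as I think you can get in Python.

It spawns a thread for each of the pipes, reads them line by line, and puts the lines into a queue (which is FIFO). The main thread loops through the queue, yielding each line.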

import threading, queue
def merge_pipes(**named_pipes):
    r'''
    Merges multiple pipes from subprocess.Popen (maybe other sources as well).
    The keyword argument keys will be used in the output to identify the source
    of the line.

    Example:
    p = subprocess.Popen(['some', 'call'],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    outputs = {'out': log.info, 'err': log.warn}
    for name, line in merge_pipes(out=p.stdout, err=p.stderr):
        outputs[name](line)

    This will output stdout to the info logger, and stderr to the warning logger
    '''

    # Constants. Could also be placed outside of the method. I just put them here
    # so the method is fully self-contained
    PIPE_OPENED=1
    PIPE_OUTPUT=2
    PIPE_CLOSED=3

    # Create a queue where the pipes will be read into
    output = queue.Queue()

    # This method is the run body for the threads that are instantiated below
    # This could be easily rewritten to be outside of the merge_pipes method,
    # but to make it fully self-contained I put it here
    def pipe_reader(name, pipe):
        r"""
        reads a single pipe into the queue
        """
        output.put( ( PIPE_OPENED, name, ) )
        try:
            for line in pipe:  # stops at EOF for both text and bytes pipes
                output.put( ( PIPE_OUTPUT, name, line.rstrip(), ) )
        finally:
            output.put( ( PIPE_CLOSED, name, ) )

    # Start a reader for each pipe
    for name, pipe in named_pipes.items():
        t=threading.Thread(target=pipe_reader, args=(name, pipe, ))
        t.daemon = True
        t.start()

    # Count how many pipes are still open. Starting from the total avoids a
    # race where one pipe opens and closes before the others have reported in.
    pipe_count = len(named_pipes)

    # Read the queue in order, blocking if there's no data,
    # until every pipe has been closed
    while pipe_count > 0:
        code, *payload = output.get()
        if code == PIPE_CLOSED:
            pipe_count -= 1
        elif code == PIPE_OUTPUT:
            yield tuple(payload)

Answer 8

This works for me (on Windows): https://github.com/waszil/subpiper

from subpiper import subpiper

def my_stdout_callback(line: str):
    print(f'STDOUT: {line}')

def my_stderr_callback(line: str):
    print(f'STDERR: {line}')

my_additional_path_list = [r'c:\important_location']

retcode = subpiper(cmd='echo magic',
                   stdout_callback=my_stdout_callback,
                   stderr_callback=my_stderr_callback,
                   add_path_list=my_additional_path_list)
