
Catching stdout in realtime from subprocess

I want to run rsync.exe with subprocess.Popen() on Windows and print its stdout in Python.

My code works, but it doesn't catch the progress until a file transfer is done! I want to print the progress for each file in real time.

I'm using Python 3.1 now, since I heard it should be better at handling I/O.

import subprocess, time, os, sys

cmd = "rsync.exe -vaz -P source/ dest/"
p, line = True, 'start'


p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=64,
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()

Some rules of thumb for subprocess:

  • Never use shell=True. It needlessly invokes an extra shell process to call your program.
  • When calling processes, arguments are passed around as lists. sys.argv in Python is a list, and so is argv in C. So you pass a list to Popen to call subprocesses, not a string.
  • Don't redirect stderr to a PIPE when you're not reading it.
  • Don't redirect stdin when you're not writing to it.
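
If the command already exists as a single string, one way to get the list form is shlex.split from the standard library. This is a minimal sketch that assumes POSIX-style quoting; Windows paths containing backslashes would need extra care.

```python
import shlex

# Split a command string into the argument list that Popen expects.
cmd = shlex.split("rsync.exe -vaz -P source/ dest/")
print(cmd)
```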

Example:

import subprocess, sys
cmd = ["rsync.exe", "-vaz", "-P", "source/", "dest/"]

p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)

# readline() returns bytes; b'' signals the end of the output
for line in iter(p.stdout.readline, b''):
    print(">>> " + line.rstrip().decode(errors="replace"))

That said, rsync probably buffers its output when it detects that it is connected to a pipe instead of a terminal. This is the default behavior: when connected to a pipe, a program must explicitly flush stdout for realtime results, otherwise the standard C library buffers the output.

To test for that, try running this instead:

cmd = [sys.executable, 'test_out.py']

and create a test_out.py file with the contents:

import sys
import time
print ("Hello")
sys.stdout.flush()
time.sleep(10)
print ("World")

Executing that subprocess should give you "Hello", then wait 10 seconds before giving "World". If that happens with the Python code above and not with rsync, rsync itself is buffering its output, so you are out of luck.

A solution would be to connect directly to a pty, using something like pexpect.
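
On Unix, the standard library's pty module is another way to give the child a terminal instead of a pipe. The following is a minimal sketch, not a drop-in replacement for pexpect. It assumes Linux behaviour at end of output (reading the master end raises OSError once the child has closed the pty) and does not work on Windows. stream_via_pty is a hypothetical helper name.

```python
import os
import pty
import subprocess
import sys

def stream_via_pty(cmd):
    """Yield raw output chunks from cmd, whose stdout is a pseudo-terminal."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(cmd, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)  # only the child keeps the slave end open
    try:
        while True:
            try:
                chunk = os.read(master_fd, 1024)
            except OSError:
                break  # assumption: Linux reports EIO once the child closed the pty
            if not chunk:
                break
            yield chunk
    finally:
        os.close(master_fd)
        proc.wait()

if __name__ == "__main__":
    # The child reports whether it believes stdout is a terminal.
    child = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]
    for chunk in stream_via_pty(child):
        print(chunk)
```

Because the child sees a terminal, a C program run this way would normally fall back to line buffering on its own.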

I know this is an old topic, but there is a solution now. Call rsync with the option --outbuf=L. Example:

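import subprocess

cmd = ['rsync', '-arzv', '--backup', '--outbuf=L', 'source/', 'dest']
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE)
# readline() returns bytes in Python 3, so decode before printing
for line in iter(p.stdout.readline, b''):
    print('>>> {}'.format(line.rstrip().decode(errors='replace')))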

Depending on the use case, you might also want to disable the buffering in the subprocess itself.

If the subprocess is a Python process, you could do this before the call:

os.environ["PYTHONUNBUFFERED"] = "1"

Alternatively, pass this in the env argument to Popen.

Otherwise, if you are on Linux/Unix, you can use the stdbuf tool, for example:

cmd = ["stdbuf", "-oL"] + cmd

See also the stdbuf documentation for other options.
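
The env variant mentioned above for a Python child can be sketched as follows. It copies the parent's environment rather than modifying os.environ in place, so only the child is affected. run_unbuffered_child is a hypothetical helper name.

```python
import os
import subprocess
import sys

# Copy the current environment and add the variable for the child only.
child_env = dict(os.environ, PYTHONUNBUFFERED="1")

def run_unbuffered_child(code):
    """Run a Python snippet in a child and yield its output lines as they arrive."""
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE, env=child_env)
    for line in iter(proc.stdout.readline, b""):
        yield line.decode().rstrip()
    proc.wait()

if __name__ == "__main__":
    snippet = "import time\nfor i in range(3):\n    print(i)\n    time.sleep(1)"
    for text in run_unbuffered_child(snippet):
        print("received:", text)
```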

On Linux, I had the same problem getting rid of the buffering. I finally used "stdbuf -o0" (or unbuffer from expect) to get rid of the PIPE buffering.

proc = Popen(['stdbuf', '-o0'] + cmd, stdout=PIPE, stderr=PIPE)
stdout = proc.stdout

I could then use select.select on stdout.

See also https://unix.stackexchange.com/questions/25372/
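
The select.select idea mentioned above could look roughly like this. It is a sketch assuming a Unix system (select does not accept pipes on Windows). poll_output and its timeout parameter are hypothetical names, and the child still needs something like stdbuf to stop buffering on its side.

```python
import os
import select
import subprocess
import sys

def poll_output(cmd, timeout=0.5):
    """Yield output chunks from cmd; yield None when nothing arrived in time."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=0)
    fd = proc.stdout.fileno()
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            yield None  # timeout: the caller can update a GUI or check a flag
            continue
        chunk = os.read(fd, 4096)  # returns at once with whatever is available
        if not chunk:
            break  # an empty read means the child closed its stdout
        yield chunk
    proc.stdout.close()
    proc.wait()

if __name__ == "__main__":
    for chunk in poll_output([sys.executable, "-c", "print('hi')"]):
        if chunk is not None:
            print(chunk)
```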

for line in p.stdout:
  ...

always blocks until the next line feed.

For "real-time" behaviour you have to do something like this:

import sys

while True:
  inchar = p.stdout.read(1)
  if inchar: # neither empty bytes nor None
    sys.stdout.buffer.write(inchar) # raw bytes, so multi-byte characters survive
    sys.stdout.buffer.flush() # show the output immediately
  else:
    print('') # finish the current line
    break

The while loop exits when the child process closes its stdout or exits. read()/read(-1) would block until the child process closed its stdout or exited.

Your problem is:

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()

The iterator itself has extra buffering.

Try doing it like this:

while True:
    line = p.stdout.readline()
    if not line:
        break
    print(line)

You cannot get stdout to print unbuffered to a pipe (unless you can rewrite the program that prints to stdout), so here is my solution:

Redirect stdout to stderr, which is not buffered. '<cmd> 1>&2' should do it. Because the redirection is shell syntax, open the process with shell=True: myproc = subprocess.Popen('<cmd> 1>&2', shell=True, stderr=subprocess.PIPE)
You cannot distinguish stdout from stderr, but you get all output immediately.

Hope this helps anyone tackling this problem.

To avoid caching of output you might want to try pexpect:

import pexpect

child = pexpect.spawn(launchcmd,args,timeout=None)
while True:
    try:
        child.expect('\n')
        print(child.before)
    except pexpect.EOF:
        break

PS: I know this question is pretty old; I'm still providing the solution that worked for me.

PPS: got this answer from another question.

p = subprocess.Popen(command,
                     bufsize=0,
                     universal_newlines=True)

I am writing a GUI for rsync in Python and had the same problem. It troubled me for several days until I found this in pydoc:

If universal_newlines is True, the file objects stdout and stderr are opened as text files in universal newlines mode. Lines may be terminated by any of '\n', the Unix end-of-line convention, '\r', the old Macintosh convention or '\r\n', the Windows convention. All of these external representations are seen as '\n' by the Python program.

It seems that rsync outputs '\r' while a transfer is in progress.
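
If you read the pipe in binary mode instead, the same effect can be had by splitting on both terminators yourself. This is a small sketch; iter_progress_lines is a hypothetical helper that accepts any binary file-like object, such as p.stdout.

```python
import io

def iter_progress_lines(stream):
    """Yield text records from a binary stream, split at carriage return or newline."""
    buf = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            break  # end of stream
        if byte in (b"\r", b"\n"):
            if buf:
                yield buf.decode(errors="replace")
                buf.clear()
        else:
            buf += byte
    if buf:
        yield buf.decode(errors="replace")

if __name__ == "__main__":
    # Stand-in for a child's stdout: progress updates separated by carriage returns.
    fake_output = io.BytesIO(b"10%\r50%\r100%\ndone\n")
    for record in iter_progress_lines(fake_output):
        print(record)
```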

Change the stdout of the rsync process to be unbuffered.

p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=0,  # 0=unbuffered, 1=line-buffered, else buffer-size
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

I've noticed that there is no mention of using a temporary file as an intermediate. The following gets around the buffering issues by writing to a temporary file, and lets you parse the data coming from rsync without connecting to a pty. I tested the following on a Linux box. The output of rsync tends to differ across platforms, so the regular expressions needed to parse it may vary:

import subprocess, time, tempfile, re

# mkstemp() returns an open file descriptor and the path of the temporary file
pipe_output, file_name = tempfile.mkstemp()
cmd = ["rsync", "-vaz", "-P", "/src/", "/dest"]

p = subprocess.Popen(cmd, stdout=pipe_output,
                     stderr=subprocess.STDOUT)
while p.poll() is None:
    # p.poll() returns None while the program is still running
    # sleep for 1 second
    time.sleep(1)
    with open(file_name) as f:
        last_line = f.readlines()
    # it's possible that it hasn't output yet, so continue
    if len(last_line) == 0: continue
    last_line = last_line[-1]
    # Matching to "[bytes downloaded]  number%  [speed] number:number:number"
    match_it = re.match(r".* ([0-9]*)%.* ([0-9]*:[0-9]*:[0-9]*).*", last_line)
    if not match_it: continue
    # in this case, the percentage is stored in match_it.group(1),
    # time in match_it.group(2).  We could do something with it here...
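
The parsing step can also be pulled into a small helper. This is a sketch that assumes a progress line shaped like "782448  63%  110.64kB/s    0:00:04"; the exact layout may differ between rsync versions and platforms. parse_progress and PROGRESS_RE are hypothetical names.

```python
import re

# Percentage, transfer rate, and h:mm:ss time as they appear in a progress line.
PROGRESS_RE = re.compile(r"(\d+)%\s+(\S+/s)\s+(\d+:\d{2}:\d{2})")

def parse_progress(line):
    """Return (percent, rate, time) from a progress line, or None if it does not match."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)

if __name__ == "__main__":
    print(parse_progress("     782448  63%  110.64kB/s    0:00:04"))
```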

If you run something like this in a thread and store ffmpeg_time in an attribute that the rest of your program can read, it works very nicely, for example when you use threading with tkinter.

import re, subprocess

input = 'path/input_file.mp4'
output = 'path/output_file.mp4'  # must differ from the input path
command = "ffmpeg -y -v quiet -stats -i \"" + str(input) + "\" -metadata title=\"@alaa_sanatisharif\" -preset ultrafast -vcodec copy -r 50 -vsync 1 -async 1 \"" + output + "\""
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True, shell=True)
# universal_newlines=True makes each '\r'-terminated stats update arrive as a line
for line in process.stdout:
    reg = re.search(r'\d\d:\d\d:\d\d', line)
    ffmpeg_time = reg.group(0) if reg else ''
    print(ffmpeg_time)
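
One way to arrange the threading is to let a background thread push lines onto a queue that the GUI polls. This is a sketch under that assumption; start_reader is a hypothetical helper, and the demo command is a stand-in for the ffmpeg call.

```python
import queue
import subprocess
import sys
import threading

def start_reader(cmd):
    """Start cmd and a daemon thread that puts its output lines on a queue."""
    lines = queue.Queue()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)

    def pump():
        for line in proc.stdout:
            lines.put(line.rstrip())
        proc.wait()
        lines.put(None)  # sentinel: no more output

    threading.Thread(target=pump, daemon=True).start()
    return lines

if __name__ == "__main__":
    q = start_reader([sys.executable, "-c", "print('00:00:01')"])
    while True:
        item = q.get()  # a GUI would call q.get_nowait() from a timer instead
        if item is None:
            break
        print(item)
```

The reading thread never touches GUI objects, which keeps toolkit calls on the main thread.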

In Python 3, here's a solution that takes a command from the command line and delivers nicely decoded strings in real time as they are received.

Receiver (receiver.py):

import subprocess
import sys

cmd = sys.argv[1:]
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
for line in p.stdout:
    print("received: {}".format(line.rstrip().decode("utf-8")))

A simple example program that generates real-time output (dummy_out.py):

import time
import sys

for i in range(5):
    print("hello {}".format(i))
    sys.stdout.flush()  
    time.sleep(1)

Output:

$ python receiver.py python dummy_out.py
received: hello 0
received: hello 1
received: hello 2
received: hello 3
received: hello 4

