
Parsing pexpect output

I'm trying to parse, in real time, the output of a program whose stdout is block-buffered, which means the output is not available until the process ends. All I need is to parse it line by line, filtering and managing data from the output, since the program can run for hours.

I've tried to capture the output with subprocess.Popen() but, as you may guess, Popen can't handle this kind of behavior: it keeps buffering until the process ends.

from subprocess import Popen, PIPE

p = Popen("my noisy stuff ", shell=True, stdout=PIPE, stderr=PIPE)
for line in p.stdout.readlines():
    # parsing text and getting data
    pass

So I found pexpect, which prints the output in real time because it treats stdout as a file. I could even do a dirty trick: write the output to a file and parse it outside the function. But OK, that is too dirty, even for me ;)

import pexpect
import sys

pexpect.run("my noisy stuff", logfile=sys.stdout)

But I guess there should be a better, more Pythonic way to do this: just manage the stdout the way subprocess.Popen does. How can I do this?

EDIT:

Running JF's proposal:

This is a deliberately wrong audit; it takes about 25 seconds to stop.

from subprocess import Popen, PIPE

command = "bully mon0 -e ESSID -c 8 -b aa:bb:cc:dd:ee:00 -v 2"

p = Popen(command, shell=True, stdout=PIPE, stderr=PIPE)

for line in iter(p.stdout.readline, b''):
    print "inside loop"
    print line

print "outside loop"
p.stdout.close()
p.wait()


#$ sudo python SCRIPT.py
                                ### <= 25 secs later......
# inside loop
#[!] Bully v1.0-21 - WPS vulnerability assessment utility

#inside loop
#[!] Using 'ee:cc:bb:aa:bb:ee' for the source MAC address

#inside loop
#[X] Unable to get a beacon from the AP, possible causes are

#inside loop
#[.]    an invalid --bssid or -essid was provided,

#inside loop
#[.]    the access point isn't on channel '8',

#inside loop
#[.]    you aren't close enough to the access point.

#outside loop

Using this method instead (EDIT: due to long delays and timeouts in the output, I had to adjust the child and add some hacks, so the final code looks like this):

import pexpect

child = pexpect.spawn(command)
child.maxsize = 1  #Turns off buffering
child.timeout = 50 # default is 30, insufficient for me. Crashes were due to this param.
for line in child:
    print line,

child.close()

This gives back the same output, but it prints the lines in real time. So... SOLVED. Thanks @JF Sebastian
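Once the lines arrive one at a time, the filtering step can be as simple as a regular expression. The following is only a sketch in Python 3, assuming every interesting line starts with a bracketed one-character marker like the `[!]`, `[X]` and `[.]` prefixes in the output above. The names `LINE_RE` and `parse_line` are made up for illustration, and what each marker means is not stated in the output, so the code only separates them.

```python
import re

# Assumed line shape, taken from the sample output: "[<marker>] <text>".
LINE_RE = re.compile(r"^\[(?P<tag>.)\]\s*(?P<text>.*)$")

def parse_line(line):
    """Return (marker, text) for a tagged line, or None for anything else."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("tag"), match.group("text")

# A few lines copied from the output above, plus a blank one.
sample = [
    "[!] Bully v1.0-21 - WPS vulnerability assessment utility",
    "",
    "[X] Unable to get a beacon from the AP, possible causes are",
    "[.]    the access point isn't on channel '8',",
]

parsed = [item for item in map(parse_line, sample) if item is not None]
# Keep only the lines carrying the "X" marker.
x_tagged = [text for tag, text in parsed if tag == "X"]
```

In the real loop, `parse_line` would be called on each line as it is read instead of on a prepared list.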

.readlines() reads all lines, so it is no wonder you don't see any output until the subprocess ends. You could use .readline() instead to read line by line as soon as the subprocess flushes its stdout buffer:

from subprocess import Popen, PIPE

p = Popen("my noisy stuff", stdout=PIPE, bufsize=1)
for line in iter(p.stdout.readline, b''):
    # process line
    ...
p.stdout.close()
p.wait()
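For a self-contained way to try the same loop, here is a minimal sketch in Python 3 (the snippets in this thread use Python 2 syntax). The child process is a hypothetical stand-in for the noisy program: a short Python script that flushes stdout after every line. The helper name `stream_lines` is also made up for this example. A child that never flushes would still deliver its output late, which is the case the pexpect approach addresses.

```python
import subprocess
import sys

# Hypothetical stand-in for the real program: prints three lines,
# flushing after each one so the parent can see them immediately.
CHILD_CODE = (
    "import sys, time\n"
    "for i in range(3):\n"
    "    print('line', i)\n"
    "    sys.stdout.flush()\n"
    "    time.sleep(0.1)\n"
)

def stream_lines(argv):
    """Yield decoded lines from the child's stdout as they arrive."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    try:
        # readline() returns b'' only at end of file, which ends the loop.
        for raw in iter(proc.stdout.readline, b""):
            yield raw.decode(errors="replace").rstrip("\r\n")
    finally:
        proc.stdout.close()
        proc.wait()

lines = list(stream_lines([sys.executable, "-c", CHILD_CODE]))
```

Iterating over `stream_lines(...)` directly, instead of collecting into a list, handles each line as soon as the child flushes it.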

If you already have pexpect, you could use it to work around the block-buffering issue:

import pexpect

child = pexpect.spawn("my noisy stuff", timeout=None)
for line in child:
    # process line
    ...
child.close()

See also the stdbuf and pty-based solutions from the question I've linked in the comments.
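As a rough illustration of the pty idea, and not the code from the linked question, the Python 3 sketch below runs the child with its stdout attached to a pseudo-terminal. Many programs switch to line buffering when they detect a terminal, which is the effect being relied on here. It assumes a Unix-like system, the helper name `run_on_pty` is made up, and the child is again a hypothetical stand-in that prints without flushing explicitly.

```python
import os
import pty
import subprocess
import sys

def run_on_pty(argv):
    """Run argv with stdout on a pseudo-terminal and yield its output lines."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdout=slave_fd, stderr=subprocess.STDOUT)
    os.close(slave_fd)  # the parent only needs the master end
    buf = b""
    try:
        while True:
            try:
                chunk = os.read(master_fd, 1024)
            except OSError:
                # On Linux, reading the master fails with EIO once the child
                # has closed its end; treat that as end of output.
                break
            if not chunk:
                break
            buf += chunk
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                # The terminal layer usually turns "\n" into "\r\n".
                yield line.rstrip(b"\r").decode(errors="replace")
        if buf:
            yield buf.decode(errors="replace")
    finally:
        os.close(master_fd)
        proc.wait()

pty_lines = list(run_on_pty([sys.executable, "-c", "print('a'); print('b')"]))
```

This is roughly the mechanism that lets pexpect.spawn show the lines early, since it also runs the child on a pseudo-terminal.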


 