
Processing lots of data through a pipe with python / popen

I'm trying to watch a process and wait for a certain pattern, say:

open someFile id=123

then, after that, I want to wait for

close id=123

I tried to write the script as follows:

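import re
from subprocess import Popen, PIPE

# universal_newlines=True makes readline() return str, so the '' EOF test
# below also works on Python 3 (where pipes yield bytes by default).
running_procs = [Popen(["process", "and", "options"], stdout=PIPE, stderr=PIPE,
                       universal_newlines=True)]

while running_procs:
    for proc in running_procs:
        retcode = proc.poll()
        if retcode is not None:  # Process finished.
            running_procs.remove(proc)
            break
        else:
            while True:
                next_line = proc.stdout.readline()
                # An empty string means EOF; stop once the process has exited.
                if next_line == '' and proc.poll() is not None:
                    break
                m = re.search(r'someFile.*id=([0-9]*)', next_line, re.M | re.I)
                if m:
                    print(m.group(1))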

But it seems to be performing far too slowly. Any suggestions on handling a lot of lines in a pipe? Is there a faster way?

Nothing in this specific example indicates that the code itself should be slow. With only a single process in your list, it is going to read lines as fast as the process makes them available. This means the code depends on the subprocess flushing its output and making lines available. Many programs switch from line buffering to block buffering when stdout is a pipe rather than a terminal, so lines can arrive in large, delayed batches. But that is really to be expected.

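Since you are always going to be reading line by line, you might want to set bufsize=1 in your Popen call so that the pipe is line buffered on your side. Note that this only affects the file object in your script, not how the subprocess buffers its own output, and on Python 3 line buffering only applies in text mode (universal_newlines=True):

Popen(["process"], stdout=PIPE, stderr=PIPE, bufsize=1, universal_newlines=True)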
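To make the line-by-line reading concrete, here is a minimal sketch for Python 3. It assumes a stand-in child process (the hypothetical CHILD script below, run through sys.executable) in place of your real command, and the helper name read_ids is made up for illustration. The child flushes after every line, which is the part your script cannot control for a real program.

```python
import re
import sys
from subprocess import PIPE, Popen

# Hypothetical stand-in for the real command: a child that prints two lines
# and flushes after each one so they reach the pipe immediately.
CHILD = (
    "import sys\n"
    "for line in ('open someFile id=123', 'close id=123'):\n"
    "    print(line)\n"
    "    sys.stdout.flush()\n"
)

OPEN_RE = re.compile(r"someFile.*id=([0-9]+)", re.I)


def read_ids(cmd):
    """Return the ids from every 'open someFile id=N' line the command prints."""
    ids = []
    # Text mode is needed for bufsize=1 (line buffering) to apply on Python 3.
    with Popen(cmd, stdout=PIPE, bufsize=1, universal_newlines=True) as proc:
        # Iterating over the pipe yields lines as they arrive and stops at EOF.
        for line in proc.stdout:
            m = OPEN_RE.search(line)
            if m:
                ids.append(m.group(1))
    return ids


if __name__ == "__main__":
    print(read_ids([sys.executable, "-c", CHILD]))  # ['123']
```

Iterating over proc.stdout avoids the explicit readline()/poll() loop: the for loop ends on EOF, and leaving the with block waits for the process to exit.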

I do, however, see an issue if you intend to run multiple processes, as suggested by the fact that you keep a list of processes and remove the finished ones from it. Your loop blocks on one process at a time: the inner while loop does not return until that process's output reaches EOF. If you did not intend to read the processes serially, this will slow down how quickly you get your data back. The processes run in parallel, but they are only monitored serially.
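If you do want to monitor several processes at once, one common approach (not the only one) is a reader thread per process, all feeding a shared queue. The following is a rough sketch for Python 3 under that assumption; the names pump and watch are hypothetical, and the commands in the usage section are placeholders for your real ones. stderr is deliberately not piped here, because a stderr pipe that nobody reads can fill up and stall the child.

```python
import queue
import sys
import threading
from subprocess import PIPE, Popen


def pump(name, proc, out_queue):
    """Forward each line of one process to the shared queue, then signal EOF."""
    for line in proc.stdout:
        out_queue.put((name, line.rstrip("\n")))
    out_queue.put((name, None))  # None marks the end of this process's output


def watch(commands):
    """Run all commands at once and yield (name, line) pairs as they arrive."""
    lines = queue.Queue()
    procs = {}
    for name, cmd in commands.items():
        proc = Popen(cmd, stdout=PIPE, bufsize=1, universal_newlines=True)
        procs[name] = proc
        threading.Thread(target=pump, args=(name, proc, lines), daemon=True).start()

    remaining = len(procs)
    while remaining:
        name, line = lines.get()  # blocks until any process produces a line
        if line is None:
            remaining -= 1
            procs[name].stdout.close()
            procs[name].wait()
        else:
            yield name, line


if __name__ == "__main__":
    # Placeholder commands standing in for the real processes.
    cmds = {
        "a": [sys.executable, "-c", "print('open someFile id=1')"],
        "b": [sys.executable, "-c", "print('open someFile id=2')"],
    }
    for name, line in watch(cmds):
        print(name, line)
```

With this layout no single slow process holds up the others: the main loop handles whichever line arrives first, so the order of lines across different processes is not deterministic.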

Aside from this, you will have to go into more detail about why you consider the results to be slow and what you expect to happen.
