
Python program using os.pipe and os.fork() issue

I've recently needed to write a script that performs an os.fork() to split into two processes. The child process becomes a server and passes data back to the parent process using a pipe created with os.pipe(). The child closes the 'r' end of the pipe and the parent closes the 'w' end, as usual. I convert the file descriptors returned by pipe() into file objects with os.fdopen.

The problem I'm having is this: the process successfully forks, and the child becomes a server. Everything works great, and the child dutifully writes data to the open 'w' end of the pipe. Unfortunately, the parent end of the pipe does two strange things:
A) It blocks on the read() operation on the 'r' end of the pipe.
B) It fails to read any data that was put on the pipe unless the 'w' end is entirely closed.

I immediately thought that buffering was the problem and added pipe.flush() calls, but these didn't help.

Can anyone shed some light on why the data doesn't appear until the writing end is fully closed? And is there a strategy to make the read() call non-blocking?

This is my first Python program that forked or used pipes, so forgive me if I've made a simple mistake.

Are you using read() without specifying a size, or treating the pipe as an iterator (for line in f)? If so, that's probably the source of your problem - read() is defined to read until the end of the file before returning, rather than just reading what is available. That means it will block until the child calls close().

In the example code linked to, this is OK - the parent is acting in a blocking manner, and just using the child for isolation purposes. If you want to continue, then either use non-blocking IO as in the code you posted (but be prepared to deal with half-complete data), or read in chunks (eg r.read(size) or r.readline()), which will block only until a specific size / line has been read. (You'll still need to call flush() in the child.)

It looks like treating the pipe as an iterator uses some further buffering as well, so "for line in r:" may not give you what you want if you need each line to be immediately consumed. It may be possible to disable this, but just specifying 0 for the buffer size in fdopen doesn't seem sufficient.

Here's some sample code that should work:

import os, sys, time

r, w = os.pipe()
r, w = os.fdopen(r, 'r', 0), os.fdopen(w, 'w', 0)   # buffer size 0: unbuffered

pid = os.fork()
if pid:          # Parent
    w.close()
    while 1:
        data = r.readline()
        if not data: break
        print "parent read: " + data.strip()
else:            # Child
    r.close()
    for i in range(10):
        print >>w, "line %s" % i
        w.flush()
        time.sleep(1)
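The sample above is Python 2. A minimal Python 3 port of the same idea might look like the following sketch (it assumes a POSIX system, since os.fork() is unavailable on Windows; iterating with iter(r.readline, '') reads line by line without the iterator's read-ahead buffer):

```python
import os
import time

r_fd, w_fd = os.pipe()
pid = os.fork()
if pid:                            # Parent: read lines as they arrive
    os.close(w_fd)
    with os.fdopen(r_fd, 'r') as r:
        for line in iter(r.readline, ''):   # '' means EOF (writer closed)
            print("parent read:", line.strip())
    os.waitpid(pid, 0)             # reap the child
else:                              # Child: write a few lines, then exit
    os.close(r_fd)
    with os.fdopen(w_fd, 'w') as w:
        for i in range(3):
            print("line %s" % i, file=w)
            w.flush()              # push each line through immediately
            time.sleep(0.1)
    os._exit(0)
```

Closing the unused ends in each process is still essential: the parent only sees EOF once every 'w' descriptor is closed.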

Using

fcntl.fcntl(readPipe, fcntl.F_SETFL, os.O_NONBLOCK)

before invoking the read() solved both problems. The read() call no longer blocks, and the data appears after just a flush() on the writing end.
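A self-contained sketch of that fix: set O_NONBLOCK on the read end, and an empty pipe raises an error (EAGAIN, surfaced as BlockingIOError in Python 3) instead of blocking until the writer closes.

```python
import fcntl
import os

r_fd, w_fd = os.pipe()

# Add O_NONBLOCK to the read end's existing flags.
flags = fcntl.fcntl(r_fd, fcntl.F_GETFL)
fcntl.fcntl(r_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

try:
    os.read(r_fd, 1024)            # nothing written yet
except BlockingIOError:
    print("no data yet, read() returned immediately")

os.write(w_fd, b"hello\n")         # now there is data
print(os.read(r_fd, 1024))         # b'hello\n'
```

Reading the raw descriptor with os.read() also sidesteps the file-object buffering discussed above.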

I see you have solved the problem of blocking I/O and buffering.

A note if you decide to try a different approach: subprocess is the equivalent of / a replacement for the fork/exec idiom. It seems like that's not what you're doing: you have just a fork (not an exec) and are exchanging data between the two processes -- in this case the multiprocessing module (in Python 2.6+) would be a better fit.
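For illustration, a minimal sketch of the same parent/server split with multiprocessing: Pipe() gives two connection objects, and send()/recv() pass whole Python objects, so there is no manual buffering or flushing to manage. (The server function name here is hypothetical, not from the original question.)

```python
from multiprocessing import Pipe, Process

def server(conn):
    # Hypothetical stand-in for the child's server loop.
    for i in range(3):
        conn.send("line %s" % i)   # each message arrives as a unit
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=server, args=(child_conn,))
    p.start()
    for _ in range(3):
        print("parent read:", parent_conn.recv())
    p.join()
```

recv() blocks only until the next message, and raises EOFError once the other end is closed and drained.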

The "parent" vs. "child" part of fork in a Python application is silly. It's a legacy from 16-bit unix days, an affectation from a time when fork and exec were Important Things for making the most of a tiny little processor.

Break your Python code into two separate parts: parent and child.

The parent part should use subprocess to run the child part.

A fork and exec may happen somewhere in there -- but you don't need to care.
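A sketch of the parent half under that split: run the child as a separate interpreter and read its stdout line by line. (In practice the child would be its own script; here a `python -c` one-liner stands in so the example is self-contained.)

```python
import subprocess
import sys

# Hypothetical child: prints three lines, flushing each one.
child_code = (
    'import sys\n'
    'for i in range(3): print("line %d" % i); sys.stdout.flush()'
)

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,   # the pipe is set up for you
    text=True,                # decode bytes to str
)
for line in proc.stdout:
    print("parent read:", line.strip())
proc.wait()
```

subprocess wires up the pipe, does any fork/exec internally, and hands the parent a plain file object.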
