
Piped Python script takes 100% of CPU when reading from broken pipe

I have two Python scripts running on an Ubuntu Linux machine. The first one sends all its output to stdout, the second one reads from stdin. They are connected by a simple pipe, i.e. something like this:

./step1.py <some_args> | ./step2.py <some_other_args>

step2 reads lines of input in an infinite loop and processes them:

while True:
    try:
        l = sys.stdin.readline()
        # processing here

Step1 crashes from time to time. When that happens (I'm not sure if always, but at least on several occasions), step2 does not crash or stop; instead it goes crazy and starts taking 100% of the CPU until I manually kill it.

Why is this happening, and how can I make step2 more robust so that it stops when the pipe is broken?

Thanks!

When step1 dies, the read end of the pipe reaches end of file, and your while loop keeps retrying a statement that no longer blocks: readline either returns an empty string immediately or, if it raises, the try swallows the exception. Either way you continuously try and fail, using 100% of the CPU.

Either add a delay between reads with time.sleep or, even better, pay attention to what readline reports: catch the specific error raised when step1 stops and quit the program instead of trying to keep reading from a dead pipe.

You probably want a sleep when the pipe is empty and an exit when the pipe dies, but which exception is thrown with what message in each case I leave as an exercise for you to determine. The sleep isn't strictly necessary in this situation, but it guards against other situations where you burn CPU on useless work.
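
As a hedged sketch of the "exit when the pipe dies" half: in Python, readline() returns an empty string only at end of file, which is what the reader sees once the writing end of the pipe has been closed. The helper name read_lines_until_eof and its handle callback are made up for illustration and are not part of the original scripts.

```python
import io
import sys


def read_lines_until_eof(stream, handle):
    """Call handle(line) for every line; return the line count at EOF."""
    count = 0
    while True:
        line = stream.readline()
        if line == "":
            # An empty string means end of file (a blank line is "\n"),
            # so the writer is gone and the loop must stop here.
            break
        handle(line)
        count += 1
    return count
```

In step2 this would be called with sys.stdin as the stream and the existing processing code as the callback, for example read_lines_until_eof(sys.stdin, process).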

Others have already explained why you end up in an endless loop in certain cases.

In the second (reading) script, you can use the idiom:

for line in sys.stdin:
    process(line)

This way you will not end up in an endless loop, because the iteration stops at end of file. Furthermore, you did not actually show which exception you try to catch in the second script, but I guess that from time to time you'll experience a 'broken pipe' error, which you can and should catch as described here: How to handle a broken pipe (SIGPIPE) in python?

The whole scheme could then look like this:

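import errno
import sys

try:
    for line in sys.stdin:
        process(line)
except IOError as e:  # the "as" form works in Python 2.6+ and Python 3
    if e.errno == errno.EPIPE:
        # EPIPE error: handle the broken pipe here
        pass
    else:
        # Other error: handle or re-raise here
        pass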
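
For Python 3 (3.3 or later), where IOError is an alias of OSError and a broken pipe is reported as BrokenPipeError with errno set to EPIPE, the scheme might be wrapped up as in the sketch below. The function name consume and the exit-status convention are assumptions for illustration; the original only shows the bare try/except.

```python
import errno
import io


def consume(stream, process):
    """Feed each line of stream to process(); return a shell-style status."""
    try:
        for line in stream:
            process(line)
    except OSError as e:
        if e.errno == errno.EPIPE:
            # Broken pipe: the other end went away, so stop quietly.
            return 1
        # Any other I/O error is unexpected here; let it propagate.
        raise
    # The loop ended normally because the stream reached end of file.
    return 0
```

A script could then finish with sys.exit(consume(sys.stdin, process)), so that it terminates both when step1 closes the pipe and when a write inside process hits a broken pipe.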
