
How to kill (or avoid) zombie processes with the subprocess module

When I kick off a Python script from within another Python script using the subprocess module, a zombie process is created when the subprocess "completes". I am unable to kill this subprocess unless I kill my parent Python process.

Is there a way to kill the subprocess without killing the parent? I know I can do this by using wait(), but I need to run my script with no_wait().

A zombie process is not a real process; it's just a remaining entry in the process table until the parent process requests the child's return code. The actual process has ended and requires no other resources but said process table entry.

We probably need more information about the processes you run in order to actually help more.

However, if your Python program knows when the child processes have ended (e.g. by reaching the end of the child's stdout data), then you can safely call process.wait():

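import subprocess

# Start the child and capture its stdout through a pipe.
process = subprocess.Popen(('ls', '-l', '/tmp'), stdout=subprocess.PIPE)

# Reading up to EOF means the child has closed its stdout.
for line in process.stdout:
    pass

subprocess.call(('ps', '-l'))  # the finished child may show up as <defunct>
process.wait()                 # collect the return code; this reaps the child
print("after wait")
subprocess.call(('ps', '-l'))  # the <defunct> entry is gone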

Example output:

$ python so2760652.py
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S   501 21328 21326  0  80   0 -  1574 wait   pts/2    00:00:00 bash
0 S   501 21516 21328  0  80   0 -  1434 wait   pts/2    00:00:00 python
0 Z   501 21517 21516  0  80   0 -     0 exit   pts/2    00:00:00 ls <defunct>
0 R   501 21518 21516  0  80   0 -   608 -      pts/2    00:00:00 ps
after wait
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S   501 21328 21326  0  80   0 -  1574 wait   pts/2    00:00:00 bash
0 S   501 21516 21328  0  80   0 -  1467 wait   pts/2    00:00:00 python
0 R   501 21519 21516  0  80   0 -   608 -      pts/2    00:00:00 ps

Otherwise, you can keep all the children in a list, and now and then call .poll() on each of them to get their return codes. After every iteration, remember to remove from the list the children whose return code is not None (i.e. the finished ones).
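The list-and-poll approach just described could be sketched roughly like this. It is only an illustration: the helper name collect_finished and the short Python one-liners used as stand-in children are assumptions, not something from the original answer.

```python
import subprocess
import sys
import time


def collect_finished(children):
    """Poll every child once and drop the finished ones from `children`.

    Returns a {pid: return_code} dict for the children that have exited.
    (Hypothetical helper, named here for illustration only.)
    """
    finished = {}
    still_running = []
    for child in children:
        code = child.poll()  # non-blocking; reaps the child if it has exited
        if code is None:
            still_running.append(child)
        else:
            finished[child.pid] = code
    children[:] = still_running  # update the caller's list in place
    return finished


def run_demo():
    # Stand-in children: two short-lived processes with different exit codes.
    children = [
        subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(0)"]),
        subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"]),
    ]
    codes = {}
    while children:
        # ... the parent would do its real work here ...
        codes.update(collect_finished(children))
        time.sleep(0.1)
    return sorted(codes.values())
```

Unlike a plain filter on the list, this variant keeps the return codes, which may or may not matter for your script.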

If you start a child and never call Popen.communicate() or use call(), nothing waits for the finished child and it remains a zombie process.

If you don't need the output of the command, you can use subprocess.call():

>>> import subprocess
>>> subprocess.call(['grep', 'jdoe', '/etc/passwd'])
0

If the output is important, you should use Popen() and communicate() to get the stdout and stderr.

>>> from subprocess import Popen, PIPE
>>> process = Popen(['ls', '-l', '/tmp'], stdout=PIPE, stderr=PIPE)
>>> stdout, stderr = process.communicate()
>>> stderr
''
>>> print stdout
total 0
-rw-r--r-- 1 jdoe jdoe 0 2010-05-03 17:05 bar
-rw-r--r-- 1 jdoe jdoe 0 2010-05-03 17:05 baz
-rw-r--r-- 1 jdoe jdoe 0 2010-05-03 17:05 foo

If you delete the subprocess object with del, so that it gets garbage collected, the defunct processes will go away without terminating your interpreter. You can try this out in the Python command line interface first.

If you simply use subprocess.Popen, you'll be fine. Here's how:

import subprocess

def spawn_some_children():
    subprocess.Popen(["sleep", "3"])
    subprocess.Popen(["sleep", "3"])
    subprocess.Popen(["sleep", "3"])

def do_some_stuff():
    spawn_some_children()
    # do some stuff
    print("children went out to play, now I can do my job...")
    # do more stuff

if __name__ == '__main__':
    do_some_stuff()

You can use .poll() on the object returned by Popen to check whether it has finished (without waiting). If it returns None, the child is still running.

Make sure you don't keep references to the Popen objects. If you do, they will not be garbage collected, so you end up with zombies. Here's an example:

import subprocess

def spawn_some_children():
    children = []
    children.append(subprocess.Popen(["sleep", "3"]))
    children.append(subprocess.Popen(["sleep", "3"]))
    children.append(subprocess.Popen(["sleep", "3"]))
    return children

def do_some_stuff():
    children = spawn_some_children()
    # do some stuff
    print("children went out to play, now I can do my job...")
    # do more stuff

    # if children finish while we are in this function,
    # they will become zombies - because we keep a reference to them

In the above example, if you want to get rid of the zombies, you can either .wait() on each of the children or .poll() until the result is not None.

Either way is fine: either not keeping references, or using .wait() or .poll().
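As a rough sketch of the .wait() variant, the example above could be adapted as follows. The children here are short Python one-liners rather than sleep 3, purely to keep the demo quick, and returning the list of return codes is an addition for illustration.

```python
import subprocess
import sys


def spawn_some_children():
    # Stand-ins for the `sleep 3` children, shortened for the demo.
    cmd = [sys.executable, "-c", "import time; time.sleep(0.2)"]
    return [subprocess.Popen(cmd) for _ in range(3)]


def do_some_stuff():
    children = spawn_some_children()
    # ... do the parent's own work while the children run ...
    # wait() blocks until each child has exited and collects its status,
    # so none of them is left behind as a zombie.
    return [child.wait() for child in children]
```

Since the children run concurrently, waiting on them one after another takes about as long as the slowest child, not the sum of all of them.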

The Python runtime takes responsibility for getting rid of zombie processes once their process objects have been garbage collected. If you see a zombie lying around, it means you have kept a process object and not called wait, poll or terminate on it.

I'm not sure what you mean by "I need to run my script with no_wait()", but I think this example does what you need. Processes will not be zombies for very long. The parent only reaps them (through poll(), which does a non-blocking wait) once they have actually terminated, so they quickly unzombify.
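One way to observe this is sketched below. It rests on several assumptions: CPython (where dropping the last reference runs Popen's finalizer right away), Python 3, and a Linux-like system that provides os.waitid. The function name reaped_after_del is made up for the demo.

```python
import os
import subprocess
import sys


def reaped_after_del():
    """Return True if dropping the Popen object reaped the finished child."""
    proc = subprocess.Popen([sys.executable, "-c", "pass"])
    pid = proc.pid
    # Block until the child has exited, but leave it unreaped (a zombie).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    # Drop the only reference; on CPython the finalizer runs here and polls the child.
    del proc
    try:
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        return True   # the kernel no longer knows this child: it was reaped
    return False      # we had to reap it ourselves just now
```

On other interpreters, or when the object is kept alive by another reference, the cleanup may happen later or not at all, so an explicit wait() or poll() remains the more dependable choice.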

#!/usr/bin/env python2.6
import subprocess
import sys
import time

children = []
#Step 1: Launch all the children asynchronously
for i in range(10):
    #For testing, launch a subshell that will sleep various times
    popen = subprocess.Popen(["/bin/sh", "-c", "sleep %s" % (i + 8)])
    children.append(popen)
    print("launched subprocess PID %s" % popen.pid)

#reverse the list just to prove we wait on children in the order they finish,
#not necessarily the order they start
children.reverse()
#Step 2: loop until all children are terminated
while children:
    #Step 3: poll all active children in order
    children[:] = [child for child in children if child.poll() is None]
    print("Still running: %s" % [popen.pid for popen in children])
    time.sleep(1)

print("All children terminated")

The output towards the end looks like this:

Still running: [29776, 29774, 29772]
Still running: [29776, 29774]
Still running: [29776]
Still running: []
All children terminated

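Like this:
import time
from subprocess import Popen

s = Popen(args)   # args: the command to run
s.terminate()     # ask the child to exit
time.sleep(0.5)   # give it a moment to die
s.poll()          # collect its exit status

It works: the zombie processes disappear.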

I'm not entirely sure what you mean by no_wait(). Do you mean you can't block waiting for child processes to finish? Assuming so, I think this will do what you want:

import os
os.wait3(os.WNOHANG)
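A single os.wait3(os.WNOHANG) call collects at most one exited child, so in practice it would probably sit in a loop. Here is a minimal sketch under that assumption; the function names are hypothetical, and the demo child is just a Python one-liner.

```python
import os
import subprocess
import sys
import time


def reap_exited_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, status, _rusage = os.wait3(os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def run_demo(timeout=10.0):
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    seen = []
    deadline = time.time() + timeout
    # Call the reaper now and then, as a main loop might between other tasks.
    while child.pid not in seen and time.time() < deadline:
        seen.extend(pid for pid, _status in reap_exited_children())
        time.sleep(0.05)
    return child.pid in seen
```

Note that reaping children this way goes behind the back of any Popen objects you still hold: their poll() or wait() can no longer obtain the real exit status afterwards.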

Recently, I came across this zombie problem because of my Python script. The actual problem was that the subprocess was being killed while the parent process did not know that the child was dead. So what I did was simply add popen.communicate() after sending the kill signal to the child process. That way the parent learns that the child is dead, the kernel can release the child's process table entry, and no zombies are formed any more.

PS: poll is also an option here, since it checks the child's status and conveys it to the parent. Often with subprocess it is better to use check_output or call if you do not need to communicate with the child's stdout and stdin.
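The kill-then-collect pattern from this answer might look roughly like the sketch below. It uses wait() with a timeout (Python 3.3+) instead of communicate(), which is a swapped-in choice for the case where no pipes are involved; the helper name stop_child and the grace period are assumptions.

```python
import signal
import subprocess
import sys


def stop_child(proc, grace_seconds=2.0):
    """Ask the child to exit, then collect its status so it cannot linger as a zombie."""
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()         # SIGKILL cannot be caught or ignored
        return proc.wait()  # reap the child for certain


def run_demo():
    # Stand-in child that would otherwise run for 30 seconds.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    return stop_child(child)
```

On POSIX the returned code is the negated signal number, so the caller can tell whether the child went away on SIGTERM or had to be killed.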

When you don't need to wait for any of the subprocesses you spawn, the simplest way to prevent zombie processes is to call signal(SIGCHLD, SIG_IGN); during initialization. Terminated subprocesses are then deleted immediately. This setting applies to the whole process, so you can only use it if there isn't any child you need to wait for.

In Python:

import signal
signal.signal(signal.SIGCHLD, signal.SIG_IGN)
# …
# call subprocess.Popen(…) as needed
