Zombie state multiprocessing library python3

My question concerns a replacement for the join() function that avoids leaving already terminated processes in a defunct (zombie) state when using Python 3's multiprocessing library. Is there an alternative that keeps child processes from terminating until they get the green light from the main process, so that they can terminate cleanly without going into a zombie state?

I prepared a quick illustration using the following code, which launches 19 processes: the first does 10 seconds of work and each of the others does 3 seconds:

import time
from multiprocessing import Process

def exe(i):
    print(i)
    # Process 1 simulates a long job; all the others finish after 3 seconds.
    if i == 1:
        time.sleep(10)
    else:
        time.sleep(3)

if __name__ == '__main__':
    procs = []
    for i in range(1, 20):
        proc = Process(target=exe, args=(i,))
        proc.start()
        procs.append(proc)

    for proc in procs:
        print(proc)  # <-- joining the others is blocked until the first process finishes its workload
        proc.join()

    print("finished")

If you launch the script, you will see that all the other processes go into a zombie state until the join() call on the first process returns. This could make the system unstable or overloaded!

Thanks

Per this thread, Marko Rauhamaa writes:

If you don't care to know when child processes exit, you can simply ignore the SIGCHLD signal:

import signal
signal.signal(signal.SIGCHLD, signal.SIG_IGN)

That will prevent zombies from appearing.

The wait(2) man page explains:

POSIX.1-2001 specifies that if the disposition of SIGCHLD is set to SIG_IGN or the SA_NOCLDWAIT flag is set for SIGCHLD (see sigaction(2)), then children that terminate do not become zombies and a call to wait() or waitpid() will block until all children have terminated, and then fail with errno set to ECHILD. (The original POSIX standard left the behavior of setting SIGCHLD to SIG_IGN unspecified. Note that even though the default disposition of SIGCHLD is "ignore", explicitly setting the disposition to SIG_IGN results in different treatment of zombie process children.)

Linux 2.6 conforms to the POSIX requirements. However, Linux 2.4 (and earlier) does not: if a wait() or waitpid() call is made while SIGCHLD is being ignored, the call behaves just as though SIGCHLD were not being ignored, that is, the call blocks until the next child terminates and then returns the process ID and status of that child.
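The behavior described in the man page can be observed directly with os.fork(). The following is only a minimal sketch, assuming a POSIX system such as Linux 2.6 or later; the function name reap_after_ignoring_sigchld is hypothetical and exists purely for illustration. It ignores SIGCHLD, forks a child that exits at once, and reports the errno that os.waitpid() fails with.

```python
import errno
import os
import signal


def reap_after_ignoring_sigchld():
    """Fork one child while SIGCHLD is ignored and report what waitpid() does.

    Hypothetical helper for illustration. Returns the errno raised by
    os.waitpid(), or None if the call succeeded (as on Linux 2.4).
    """
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately without cleanup handlers
        try:
            # Blocks until the child has terminated. The kernel has already
            # discarded its status, so there is nothing left to reap.
            os.waitpid(pid, 0)
        except ChildProcessError as exc:
            return exc.errno
        return None
    finally:
        # Restore the earlier disposition so the rest of the program is
        # unaffected (None means it was not installed from Python).
        if previous is not None:
            signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    print(reap_after_ignoring_sigchld() == errno.ECHILD)
```

Because the child's exit status is thrown away, this approach only suits programs that never need to inspect how a child ended.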

So if you are using Linux 2.6 or later, or another POSIX-compliant OS, setting SIGCHLD to SIG_IGN as in the snippet quoted above will allow child processes to exit without becoming zombies. If you are not using a POSIX-compliant OS, then the thread above offers a number of options. Below is one alternative, somewhat similar to Marko Rauhamaa's third suggestion.


If for some reason you need to know when child processes exit and wish to handle (at least some of) them differently, then you could set up a queue to allow the child processes to signal the main process when they are done. Then the main process can call the appropriate join in the order in which it receives items from the queue:

import time
import multiprocessing as mp

def exe(i, q):
    try:
        print(i)
        if i == 1:
            time.sleep(10)
        elif i == 10:
            raise Exception('I quit')
        else:
            time.sleep(3)
    finally:
        # Always report completion, even if the work above raised.
        q.put(mp.current_process().name)

if __name__ == '__main__':
    procs = dict()
    q = mp.Queue()
    for i in range(1, 20):
        proc = mp.Process(target=exe, args=(i, q))
        proc.start()
        procs[proc.name] = proc

    # Join each process as soon as it announces that it is done.
    while procs:
        name = q.get()
        proc = procs[name]
        print(proc)
        proc.join()
        del procs[name]

    print("finished")

yields a result like

...    
<Process(Process-10, stopped[1])>  # <-- process with exception still gets joined
19
<Process(Process-2, started)>
<Process(Process-4, stopped)>
<Process(Process-6, started)>
<Process(Process-5, stopped)>
<Process(Process-3, stopped)>
<Process(Process-9, started)>
<Process(Process-7, stopped)>
<Process(Process-8, started)>
<Process(Process-13, started)>
<Process(Process-12, stopped)>
<Process(Process-11, stopped)>
<Process(Process-16, started)>
<Process(Process-15, stopped)>
<Process(Process-17, stopped)>
<Process(Process-14, stopped)>
<Process(Process-18, started)>
<Process(Process-19, stopped)>
<Process(Process-1, started)>      # <-- Process-1 ends last
finished
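A related way to join children in completion order, without having them report through a queue, is to wait on each process's sentinel with multiprocessing.connection.wait(). What follows is a sketch rather than part of the original answer: work and join_in_completion_order are hypothetical names, and the "fork" start method is an assumption that restricts it to POSIX systems.

```python
import multiprocessing as mp
import time
from multiprocessing.connection import wait


def work(seconds):
    """Stand-in workload: sleep for the given number of seconds."""
    time.sleep(seconds)


def join_in_completion_order(delays):
    """Start one process per delay and join each one as soon as it exits.

    Hypothetical helper for illustration. Returns the delays in the order
    in which their processes finished.
    """
    ctx = mp.get_context("fork")  # assumption: POSIX-only start method
    pending = {}
    for delay in delays:
        proc = ctx.Process(target=work, args=(delay,))
        proc.start()
        # The sentinel becomes ready when the process ends.
        pending[proc.sentinel] = (proc, delay)

    finished = []
    while pending:
        # Block until at least one child has exited, then reap those children.
        for sentinel in wait(list(pending)):
            proc, delay = pending.pop(sentinel)
            proc.join()
            finished.append(delay)
    return finished


if __name__ == "__main__":
    print(join_in_completion_order([1.0, 0.2, 0.5]))
```

Unlike the queue version, the children need no cooperation here, so a child that is killed before it can report still gets joined.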
