Multiprocessing does not join processes from a pool on Linux

Question

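I have a simplified program in Python 3.6 that runs several threads. Each thread then creates a process pool and runs a job in parallel.

For some reason the code runs fine on Windows but hangs on Linux after a few cycles. This is possibly because Linux uses fork, rather than spawn, to create new processes. The behaviour can be changed, but spawning processes is too slow for my needs.

Here is the code: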

import time
import random
from threading import Thread
from multiprocessing import Pool
 
NUM_PROCESSES = 2
NUM_TRIALS = 3
random.seed(42)

 
class TestPool:
    def process(self):
        opt_profiles = ['P1', 'P2', 'P3', 'P4']
        input_data = [1, 2, 3]
 
        for _ in range(100):
            threads = []
            try:
                for opt_profile in opt_profiles:
                    thread = self.process_async(opt_profile, input_data)
                    threads.append(thread)
            finally:
                print('Waiting for threads to be finished')
                for thread in threads:
                    thread.join()
                print('Threads are finished')
 
    def process_async(self, opt_profile, input_data):
        thread = Thread(target=self._process_asynch_pool, args=[opt_profile, input_data])
        thread.start()
        return thread
 
    def _process_asynch_pool(self, opt_profile, input_data):
        print(f'Processing profile: {opt_profile}, data: {input_data}')
        p = Pool(NUM_PROCESSES)
        print(f'Running profile: {opt_profile}')
        processed_data = p.map(self._process_asynch_data, input_data)
        print(f'Closing profile: {opt_profile}')
        p.close()
        print(f'Joining profile: {opt_profile}')
        p.join()
        print(f'Processes have joined.')
  
    def _process_asynch_data(self, input_data):
        print(f'Received data: {input_data}')
        result = input_data * 10
        time.sleep(1)
        return result
 
 
if __name__ == '__main__':
    pool = TestPool()
    pool.process()

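The code hangs on the thread.join() line, but the logs indicate that every process has finished its work.

Edit: besides the original system (CentOS, Python 3.6), I can also reproduce the problem on Ubuntu (WSL) with Python 3.8.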

Answer 1

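The documentation for multiprocessing.pool.Pool instances states:

Warning: multiprocessing.pool objects have internal resources that need to be properly managed (like any other resource) by using the pool as a context manager or by calling close() and terminate() manually. Failure to do this can lead to the process hanging on finalization.

Note that it is not correct to rely on the garbage collector to destroy the pool, as CPython does not assure that the finalizer of the pool will be called (see object.__del__() for more information).

I agree with the other commenters that I would expect the code to run as is. But you never call terminate, either explicitly or implicitly (the latter is what happens when the pool instance is used as a context manager), and you re-create the pool many times. One thing to try is adding a call to terminate() after the call to join() (p.terminate() after p.join() in your code) and seeing whether that fixes the problem.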
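To make the context-manager variant concrete, here is a minimal sketch of what each thread could run instead of the original _process_asynch_pool. The names run_profile and times_ten are hypothetical stand-ins, not taken from the question, and the explicit choice of the fork start method is an assumption made to match the Linux case being discussed. Whether this removes the hang has not been verified here; it only shows the pool being terminated on every use.

```python
import multiprocessing
import operator
from functools import partial

NUM_PROCESSES = 2

# Hypothetical stand-in for the question's _process_asynch_data: multiplies by 10.
# A partial over operator.mul is picklable, so it can be sent to worker processes.
times_ten = partial(operator.mul, 10)


def run_profile(opt_profile, input_data):
    """Run one profile in its own pool and return the processed data."""
    # Assumption: use fork where it exists (the Linux case from the question),
    # otherwise fall back to the platform's default start method.
    methods = multiprocessing.get_all_start_methods()
    ctx = multiprocessing.get_context("fork" if "fork" in methods else None)
    print(f'Processing profile: {opt_profile}, data: {input_data}')
    with ctx.Pool(NUM_PROCESSES) as pool:
        processed_data = pool.map(times_ten, input_data)
        pool.close()
        pool.join()
    # Leaving the with block has called pool.terminate().
    return processed_data


if __name__ == '__main__':
    print(run_profile('P1', [1, 2, 3]))
```

In the threaded program, run_profile would be the Thread target, so every pool a thread creates is closed, joined and terminated before the thread ends.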

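Another thing worth trying for diagnostic purposes (apart from the fact that it would be more efficient) is to create a single multiprocessing pool at the outset, ideally large enough to handle all the tasks that will be submitted to it in each trial iteration. Since the worker function _process_asynch_data spends most of its time waiting in the call to time.sleep, there is no problem with creating a pool larger than the number of cores you have; it is actually desirable. You will also have to pass the pool as an argument to your threads. For example: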

import time
from threading import Thread
from multiprocessing import Pool

NUM_TRIALS = 100
SLEEP_TIME = 1

class TestPool:
    def process(self):
        opt_profiles = ['P1', 'P2', 'P3', 'P4']
        input_data = [1, 2, 3]

        tasks_per_trial = len(opt_profiles) * len(input_data)
        with Pool(tasks_per_trial) as pool:
            for _ in range(NUM_TRIALS):
                threads = []
                try:
                    for opt_profile in opt_profiles:
                        thread = self.process_async(pool, opt_profile, input_data)
                        threads.append(thread)
                finally:
                    print('Waiting for threads to be finished')
                    for thread in threads:
                        thread.join()
                    print('Threads are finished')
            # All trials are done: shut the pool down in an orderly way.
            print('Closing pool.')
            pool.close()
            print('Joining pool.')
            pool.join()
            print('Pool has been joined.')
        # Leaving the with block calls pool.terminate() implicitly.


    def process_async(self, pool, opt_profile, input_data):
        thread = Thread(target=self._process_asynch_pool, args=[pool, opt_profile, input_data])
        thread.start()
        return thread

    def _process_asynch_pool(self, pool, opt_profile, input_data):
        print(f'Processing profile: {opt_profile}, data: {input_data}')
        print(f'Running profile: {opt_profile}')
        processed_data = pool.map(self._process_asynch_data, input_data)

    def _process_asynch_data(self, input_data):
        print(f'Received data: {input_data}')
        result = input_data * 10
        time.sleep(SLEEP_TIME)
        return result


if __name__ == '__main__':
    pool = TestPool()
    pool.process()
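
The remark above about sizing the pool beyond the number of cores can be checked with a small timing sketch. The helper timed_map is hypothetical, time.sleep stands in for a worker that mostly waits, and the fork start method is again an assumption chosen to match the Linux case. Absolute timings will vary by machine; only the comparison matters.

```python
import multiprocessing
import time


def timed_map(num_processes, num_tasks, sleep_time):
    """Return the seconds needed to run num_tasks sleeps on num_processes workers."""
    # Assumption: use fork where it exists, otherwise the platform default.
    methods = multiprocessing.get_all_start_methods()
    ctx = multiprocessing.get_context("fork" if "fork" in methods else None)
    with ctx.Pool(num_processes) as pool:
        # Start the clock after the workers exist, so only task time is measured.
        start = time.perf_counter()
        # chunksize=1 hands each task out separately so idle workers can pick it up.
        pool.map(time.sleep, [sleep_time] * num_tasks, chunksize=1)
        return time.perf_counter() - start


if __name__ == '__main__':
    small = timed_map(2, 8, 0.3)
    large = timed_map(8, 8, 0.3)
    print(f'2 workers: {small:.2f}s, 8 workers: {large:.2f}s')
```

With two workers the eight sleeps run in four rounds, while eight workers can run them all at once, which is why a pool sized to the number of tasks per trial suits this wait-bound workload.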
