Python multiprocessing with start method 'spawn' doesn't work

Question

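I wrote a Python class to draw plots in parallel. It works fine on Linux, where the default start method is fork, but when I tried it on Windows I ran into problems (they can be reproduced on Linux using the spawn start method; see the code below). I always end up with this error: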

Traceback (most recent call last):
  File "test.py", line 50, in <module>
    test()
  File "test.py", line 7, in test
    asyncPlotter.saveLinePlotVec3("test")
  File "test.py", line 41, in saveLinePlotVec3
    args=(test, ))
  File "test.py", line 34, in process
    p.start()
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle weakref objects

C:\Python\MonteCarloTools>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 99, in spawn_main
    new_handle = reduction.steal_handle(parent_pid, pipe_handle)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py", line 82, in steal_handle
    _winapi.PROCESS_DUP_HANDLE, False, source_pid)
OSError: [WinError 87] The parameter is incorrect

I hope there is a way to make this code work on Windows. Here is a link to the different start methods available on Linux and Windows: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

import multiprocessing as mp
import time  # needed by waitOnPool below

def test():

    manager = mp.Manager()
    asyncPlotter = AsyncPlotter(manager.Value('i', 0))

    asyncPlotter.saveLinePlotVec3("test")
    asyncPlotter.saveLinePlotVec3("test")

    asyncPlotter.join()


class AsyncPlotter():

    def __init__(self, nc, processes=mp.cpu_count()):

        self.nc = nc
        self.pids = []
        self.processes = processes


    def linePlotVec3(self, nc, processes, test):

        self.waitOnPool(nc, processes)

        print(test)

        nc.value -= 1


    def waitOnPool(self, nc, processes):

        while nc.value >= processes:
            time.sleep(0.1)
        nc.value += 1


    def process(self, target, args):

        ctx = mp.get_context('spawn') 
        p = ctx.Process(target=target, args=args)
        p.start()
        self.pids.append(p)


    def saveLinePlotVec3(self, test):

        self.process(target=self.linePlotVec3,
                       args=(self.nc, self.processes, test))


    def join(self):
        for p in self.pids:
            p.join()


if __name__=='__main__':
    test()
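
To check which start methods a platform offers, and to request spawn explicitly when reproducing the Windows behaviour on Linux, something like the following sketch can be used. It relies only on documented multiprocessing calls; the helper name describe_start_methods is made up for illustration.

```python
import multiprocessing as mp


def describe_start_methods():
    """Return (available, configured) start methods for this platform."""
    # get_all_start_methods() lists what the platform supports;
    # "spawn" is available everywhere, "fork" only on POSIX systems.
    available = mp.get_all_start_methods()
    # allow_none=True reports the configured method without fixing the
    # default as a side effect; None means nothing has been set yet.
    configured = mp.get_start_method(allow_none=True)
    return available, configured


if __name__ == "__main__":
    available, configured = describe_start_methods()
    print("available:", available)
    print("configured:", configured)
    # A context object selects a start method locally, without
    # changing the global default for the rest of the program.
    ctx = mp.get_context("spawn")
    print("context uses:", ctx.get_start_method())
```

Using a context, as the question's process method already does, keeps the choice of start method local to the code that needs it.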

Answer 1

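When using the spawn start method, the Process object itself is pickled for use in the child process. In your code, the target=target argument is a bound method of AsyncPlotter. It looks like the entire asyncPlotter instance must also be pickled for that to work, and that includes self.manager, which evidently cannot be pickled.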
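
The effect can be seen without starting any process. The sketch below is an illustration, not the asker's code: Plotter and Resource are hypothetical stand-ins, and a weak reference is used as the unpicklable attribute because that is what the traceback complains about.

```python
import pickle
import weakref


class Resource:
    """Stand-in for any object that other code keeps a weak reference to."""


class Plotter:
    """Hypothetical class whose instances may hold an unpicklable attribute."""

    def __init__(self, ref=None):
        self.ref = ref

    def plot(self, label):
        print(label)


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


if __name__ == "__main__":
    resource = Resource()
    plain = Plotter()
    holding_weakref = Plotter(ref=weakref.ref(resource))

    # Pickling a bound method serializes the instance it is bound to,
    # so the result depends on everything stored on that instance.
    print("plain instance method:", can_pickle(plain.plot))
    print("instance holding a weakref:", can_pickle(holding_weakref.plot))
```

A module-level function used as the target avoids this, because it is pickled by reference to its name and carries no instance state along.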

In short, keep the Manager outside of AsyncPlotter. This works on my macOS system:

def test():
    manager = mp.Manager()
    asyncPlotter = AsyncPlotter(manager.Value('i', 0))
    ...

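Also, as noted in your comment, asyncPlotter doesn't work when it is reused. I don't know the details, but it looks like it has something to do with how the Value object is shared across processes. The test function would need to look like this: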

def test():
    manager = mp.Manager()
    nc = manager.Value('i', 0)

    asyncPlotter1 = AsyncPlotter(nc)
    asyncPlotter1.saveLinePlotVec3("test 1")
    asyncPlotter2 = AsyncPlotter(nc)
    asyncPlotter2.saveLinePlotVec3("test 2")

    asyncPlotter1.join()
    asyncPlotter2.join()

All in all, you may want to restructure your code and use a process pool. It already handles what AsyncPlotter is doing with cpu_count and parallel execution:

from multiprocessing import Pool, set_start_method
from random import random
import time

def linePlotVec3(test):
    time.sleep(random())
    print("test", test)

if __name__ == "__main__":
    set_start_method("spawn")
    with Pool() as pool:
        pool.map(linePlotVec3, range(20))

Or you could use ProcessPoolExecutor to do pretty much the same thing. This example submits tasks one at a time instead of mapping over a list:

from concurrent.futures import ProcessPoolExecutor
import multiprocessing as mp
import time
from random import random

def work(i):
    r = random()
    print("work", i, r)
    time.sleep(r)

def main():
    ctx = mp.get_context("spawn")
    with ProcessPoolExecutor(mp_context=ctx) as pool:
        for i in range(20):
            pool.submit(work, i)

if __name__ == "__main__":
    main()

Answer 2

For portability, all objects passed as arguments to a function that will run in a process must be picklable.
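
One way to act on this is to test the arguments before starting the process. The following is a minimal sketch; the helper name unpicklable_args is hypothetical. Note that multiprocessing uses its own pickler subclass (ForkingPickler, visible in the traceback) with some extra reducers, so a check with plain pickle is only an approximation and can be stricter than what multiprocessing accepts.

```python
import pickle
import threading


def unpicklable_args(args):
    """Return the positions of arguments that pickle cannot serialize."""
    bad = []
    for index, arg in enumerate(args):
        try:
            pickle.dumps(arg)
        except (pickle.PicklingError, TypeError, AttributeError):
            bad.append(index)
    return bad


if __name__ == "__main__":
    # A lock from the threading module is a common example of an
    # object that cannot be pickled.
    args = (1, "label", threading.Lock())
    print(unpicklable_args(args))  # prints [2]
```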

Answer 1: score 3, accepted, 2019-07-25 00:43:33
Answer 2: score 1, 2019-07-24 22:00:31
