Python multiprocessing pool.map hangs after an sklearn function call

Question

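I am trying to use multiprocessing to perform some computations between a 2D array and a collection of 2D arrays. Suppose I have a matrix mat1 and a collection of matrices test, and I want to compute the matrix product of mat1 with every element of test. I run the computation in parallel with multiprocessing because test is very large. However, I noticed that even for a small test the computation never finishes. Specifically, the program never seems to complete the matrix multiplication. A call to a particular sklearn function appears to cause the problem. I wrote the following code to illustrate this (I use partial rather than starmap because I want to use imap and tqdm later):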

from multiprocessing import Pool
from functools import partial
import numpy as np
import sklearn.metrics.pairwise  # import the submodule explicitly so rbf_kernel can be reached

def bar(y, x):

    # this does not seem to complete
    mul = x @ y.T

    # so this does not print
    print('done')

    return mul

def foo():

    mat1 = np.ones((1000000, 14))
    test = (np.ones((1,14)), np.ones((1,14)))

    # these will finish
    print(mat1 @ test[0].T)
    print(mat1 @ test[1].T)

    with Pool(6) as pool:
        result = pool.map(partial(bar, x=mat1), test)
        pool.close()
        pool.join()

if __name__ == "__main__":

    # Causes the hang
    sklearn.metrics.pairwise.rbf_kernel(np.ones((9000, 14)), 
                                        np.ones((9000, 14)))

    foo()

Note: for those unfamiliar with partial, this is from the documentation:

functools.partial(func[,*args][, **keywords])

Return a new partial object which, when called, will behave like func called with the positional arguments args and keyword arguments keywords.
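To make the partial call in the question concrete, here is a minimal sketch of what partial(bar, x=mat1) produces. It uses plain Python lists instead of NumPy arrays so it runs without extra packages; this bar is a hypothetical stand-in for the one in the question, and bar_with_mat1 and the tiny mat1 are illustrative names and values only.

```python
from functools import partial


def bar(y, x):
    # Same argument order as in the question: y comes from the iterable,
    # x is the fixed matrix bound by keyword. This computes x @ y for a
    # list-of-rows x and a flat vector y.
    return [sum(a * b for a, b in zip(row, y)) for row in x]


# Hypothetical small stand-in for mat1 (plain lists, not a NumPy array).
mat1 = [[1, 2], [3, 4]]

# Bind x once. The resulting callable takes only y, which is the
# one-argument shape that pool.map and pool.imap expect.
bar_with_mat1 = partial(bar, x=mat1)

print(bar_with_mat1([1, 1]))  # same as bar([1, 1], x=mat1)
```

Binding x by keyword leaves the first positional slot free, so each element of the iterable is passed as y.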

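I am forced to stop the execution manually, otherwise it runs forever. Am I using multiprocessing incorrectly?

For those interested, the full traceback after the forced stop is below: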

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-18-6c073b574e37> in <module>
      8     
      9     sklearn.metrics.pairwise.rbf_kernel(np.ones((9000, 14)), np.ones((9000, 14)))
---> 10     foo()
     11 

<ipython-input-17-d183fc19ae3c> in foo()
     11     with Pool(6) as pool:
     12     # this will not finish
---> 13         result = pool.map(partial(bar, x=mat1), test)
     14         p.close()
     15         p.join()

~/anaconda3/lib/python3.7/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    266         in a list that is returned.
    267         '''
--> 268         return self._map_async(func, iterable, mapstar, chunksize).get()
    269 
    270     def starmap(self, func, iterable, chunksize=None):

~/anaconda3/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
    649 
    650     def get(self, timeout=None):
--> 651         self.wait(timeout)
    652         if not self.ready():
    653             raise TimeoutError

~/anaconda3/lib/python3.7/multiprocessing/pool.py in wait(self, timeout)
    646 
    647     def wait(self, timeout=None):
--> 648         self._event.wait(timeout)
    649 
    650     def get(self, timeout=None):

~/anaconda3/lib/python3.7/threading.py in wait(self, timeout)
    550             signaled = self._flag
    551             if not signaled:
--> 552                 signaled = self._cond.wait(timeout)
    553             return signaled
    554 

~/anaconda3/lib/python3.7/threading.py in wait(self, timeout)
    294         try:    # restore state no matter what (e.g., KeyboardInterrupt)
    295             if timeout is None:
--> 296                 waiter.acquire()
    297                 gotit = True
    298             else:

KeyboardInterrupt:

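Update 1:

After more debugging I found something strange. After implementing sokato's code, I managed to fix the example. However, I can trigger the problem again by calling the following sklearn function in main(), before foo():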

sklearn.metrics.pairwise.rbf_kernel(np.ones((9000, 14)), np.ones((9000, 14)))

I have updated the original post to reflect this.

Answer 1

You need to close the multiprocessing pool. For example:

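from multiprocessing import Pool
from functools import partial
import numpy as np

def bar(y, x):

    # multiply the fixed matrix x by the transposed test matrix y
    mul = x @ y.T

    # printed once the multiplication has finished
    print('done')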

    return mul

def foo():

    mat1 = np.ones((1000000, 14))
    test = (np.ones((1,14)), np.ones((1,14)))

    with Pool(5) as p:
        result = p.map(partial(bar, x=mat1), test)
        # close the pool once all the work has been submitted
        p.close()

if __name__ == "__main__":

    foo()

To match your exact syntax, you can do this:

    pool = Pool(6)
    result = pool.map(partial(bar, x=mat1), test)
    pool.close()

If you are interested in learning more, I encourage you to check out the documentation: https://docs.python.org/3.4/library/multiprocessing.html?highlight=process#multiprocessing.pool.Pool
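Regarding the hang that reappears after the rbf_kernel call: one possibility, not confirmed anywhere in this thread, is that the default fork start method copies the state of a multithreaded native library (BLAS or OpenMP) that the sklearn call initialized in the parent, and the workers then block inside the matrix product. Under that assumption, a commonly suggested workaround is to create the pool from a "spawn" context so each worker starts as a fresh interpreter. The sketch below shows only the pattern: bar is a pure-Python stand-in for the question's function, and run_with_spawn, its processes and timeout parameters, and the small list-based inputs are hypothetical names and values chosen for illustration.

```python
import multiprocessing as mp
from functools import partial


def bar(y, x):
    # Pure-Python stand-in for `x @ y.T` in the question.
    return [sum(a * b for a, b in zip(row, y)) for row in x]


def run_with_spawn(test, mat1, processes=2, timeout=60):
    # "spawn" starts every worker as a new interpreter, so workers do not
    # inherit thread state that a native library created in the parent.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes) as pool:
        # map_async(...).get(timeout=...) raises multiprocessing.TimeoutError
        # instead of blocking forever if the workers never answer.
        return pool.map_async(partial(bar, x=mat1), test).get(timeout=timeout)


if __name__ == "__main__":
    # The guard is required with "spawn": workers re-import this module.
    mat1 = [[1.0] * 14 for _ in range(1000)]
    test = ([1.0] * 14, [1.0] * 14)
    results = run_with_spawn(test, mat1)
    print(len(results), len(results[0]))
```

With "spawn", the bound matrix is pickled and sent to the workers rather than shared through fork, so for a mat1 as large as the one in the question the startup and transfer cost is higher. The "forkserver" start method is another option on Unix.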
