Python multiprocessing with a single function

Question

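I have a simulation that is currently running, but the ETA is about 40 hours. I'm trying to speed it up with multiprocessing.

It essentially iterates over 3 values of one variable (L) and over 99 values of a second variable (a). Using these values, it runs a complex simulation and returns 9 different standard deviations. So (even though I haven't coded it that way yet) it is essentially a function that takes two values (L, a) as input and returns 9 values.

Here is the essence of the code I have: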

STD_1 = []
STD_2 = []
# etc.

for L in range(0,6,2):
    for a in range(1,100):
        ### simulation code ###
        STD_1.append(value_1)
        STD_2.append(value_2)
        # etc.

Here is what I could change it to:

master_list = []

def simulate(a,L):
    ### simulation code ###
    return (a, L, STD_1, STD_2)  # etc.

for L in range(0,6,2):
    for a in range(1,100): 
        master_list.append(simulate(a,L))

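Since each simulation is independent, it seems like an ideal place to use some sort of multithreading or multiprocessing.

How exactly would I go about coding this?

EDIT: Also, will everything be returned to the master list in order, or could it end up out of order if multiple processes are working?

EDIT 2: This is my code, but it doesn't run correctly. It asks whether I want to kill the program right after I run it.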

import multiprocessing

data = []

for L in range(0,6,2):
    for a in range(1,100):
        data.append((L,a))

print (data)

def simulation(arg):
    # unpack the tuple
    a = arg[1]
    L = arg[0]
    STD_1 = a**2
    STD_2 = a**3
    STD_3 = a**4
    # simulation code #
    return((STD_1,STD_2,STD_3))

print("1")

p = multiprocessing.Pool()

print ("2")

results = p.map(simulation, data)

EDIT 3: Also, what are the limitations of multiprocessing? I've heard that it doesn't work on OS X. Is this correct?

Answer 1

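1. Wrap the data for each iteration in a tuple.
2. Make a list, data, of those tuples.
3. Write a function f that processes one tuple and returns one result.
4. Create a p = multiprocessing.Pool() object.
5. Call results = p.map(f, data).

This runs as many instances of f in parallel as your machine has cores, each in a separate process.

Edit 1: Example: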

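from multiprocessing import Pool

data = [('bla', 1, 3, 7), ('spam', 12, 4, 8), ('eggs', 17, 1, 3)]

def f(t):
    # Unpack one work item and return a single result
    name, a, b, c = t
    return (name, a + b + c)

if __name__ == '__main__':
    # The guard keeps worker processes from re-running this block on import
    p = Pool()
    results = p.map(f, data)
    print(results)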
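
Applied to the (L, a) grid from the question, the same pattern might look like the sketch below. It uses Pool.starmap so that the two-argument simulate(a, L) signature proposed in the question can stay as it is. The helper name build_grid and the arithmetic inside simulate are placeholders for illustration, not the real simulation. Like Pool.map, starmap returns results in the same order as the inputs, so the master list lines up with the grid.

```python
from multiprocessing import Pool


def simulate(a, L):
    # Placeholder arithmetic standing in for the real simulation code.
    return (a, L, a ** 2, a ** 3)


def build_grid():
    # (a, L) pairs in the same order as the question's nested loops.
    return [(a, L) for L in range(0, 6, 2) for a in range(1, 100)]


if __name__ == "__main__":
    with Pool() as pool:
        # starmap unpacks each (a, L) tuple into simulate(a, L).
        master_list = pool.starmap(simulate, build_grid())
    # Results come back in input order: master_list[i] belongs to grid[i].
    print(master_list[:3])
```

Returning a and L alongside the results, as the question already suggests, makes each row self-describing even if it is later sorted or filtered.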

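Edit 2:

Multiprocessing should work fine on UNIX-like platforms such as OS X. Only platforms that lack os.fork (mainly MS Windows) need special attention, and even there it still works. See the multiprocessing documentation.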
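
The "special attention" mostly comes down to guarding the entry point. On platforms where workers are started with the spawn method (Windows, and macOS by default since Python 3.8), each worker re-imports the main module, so module-level code that creates a Pool would run again in every worker. That is a plausible, though unconfirmed, cause of the behaviour described in EDIT 2 of the question. A minimal sketch of that script with the guard added, where the function body is still the question's placeholder arithmetic:

```python
import multiprocessing


def simulation(arg):
    # Unpack the (L, a) tuple; L is unused by the placeholder arithmetic.
    L, a = arg
    return (a ** 2, a ** 3, a ** 4)


def main():
    data = [(L, a) for L in range(0, 6, 2) for a in range(1, 100)]
    # Shows which start method this platform uses (fork, spawn or forkserver).
    print("start method:", multiprocessing.get_start_method())
    with multiprocessing.Pool() as p:
        results = p.map(simulation, data)
    print(len(results), results[0])


if __name__ == "__main__":
    # Only the parent process runs main(); workers just import the module.
    main()
```

Running the script from a terminal rather than an interactive IDE shell is also worth trying, since multiprocessing expects worker processes to be able to import the main module.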

Answer 2

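If ordering is not important, use Pool().imap_unordered. It returns an iterator that yields each result as soon as it is ready, rather than waiting for results in input order.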
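
Because results can arrive in any order, each one needs to carry its own inputs. The sketch below assumes a placeholder simulate function and a hypothetical helper, collect, that indexes results by their (L, a) pair.

```python
from multiprocessing import Pool


def simulate(params):
    L, a = params
    # Placeholder result; the inputs are returned so each row identifies itself.
    return (L, a, a ** 2)


def collect(results):
    # Index results by (L, a) so arrival order does not matter.
    return {(L, a): rest for L, a, *rest in results}


if __name__ == "__main__":
    grid = [(L, a) for L in range(0, 6, 2) for a in range(1, 100)]
    with Pool() as pool:
        # Consume the iterator while the pool is still alive.
        by_input = collect(pool.imap_unordered(simulate, grid))
    print(by_input[(2, 5)])
```

With a 40-hour job, iterating over imap_unordered directly also makes it easy to write each result to disk as it completes instead of holding everything until the end.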

Answer 3

Here is one way to run it in parallel threads:

import threading

L_a = []

for L in range(0,6,2):
    for a in range(1,100):
        L_a.append((L,a))
        # Add the rest of your objects here

def RunParallelThreads():
    # Create an index list
    indexes = range(0,len(L_a))
    # Create the output list
    output = [None for i in indexes]
    # Create all the parallel threads
    threads = [threading.Thread(target=simulate,args=(output,i)) for i in indexes]
    # Start all the parallel threads
    for thread in threads: thread.start()
    # Wait for all the parallel threads to complete
    for thread in threads: thread.join()
    # Return the output list
    return output

def simulate(output, index):
    # Each thread writes its result into its own slot of the shared output list
    (L, a) = L_a[index]
    output[index] = (a, L)  # Add the rest of your objects here

master_list = RunParallelThreads()
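
The code above starts one thread per (L, a) pair, which is 297 threads at once. A bounded pool gives the same ordered output with fewer moving parts. The sketch below uses concurrent.futures.ThreadPoolExecutor from the standard library, with a hypothetical run_in_threads helper and an arbitrary max_workers value. Note that in CPython the global interpreter lock keeps threads from executing Python bytecode in parallel, so for a CPU-bound simulation threads are unlikely to shorten the run time; they mainly help with I/O-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(params):
    L, a = params
    return (a, L)  # Placeholder, mirroring the answer's example


def run_in_threads(pairs, max_workers=8):
    # executor.map yields results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(simulate, pairs))


pairs = [(L, a) for L in range(0, 6, 2) for a in range(1, 100)]
master_list = run_in_threads(pairs)
```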

3 answers

Answer 1: score 2, posted 2014-02-22 19:40:42
Answer 2: score 0, posted 2021-03-01 13:37:57
Answer 3: score -1, posted 2014-02-22 20:09:09
