How to use multiprocessing in Python

Question

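I'm new to Python and I'd like to parallelize the code below using multiprocessing. How should I modify the code? I've been searching for ways to do this with Pool, but I've found only a few examples I can follow. Can anyone help me? Thanks.

Note that setinner and setouter are two independent functions, and that's where I want to use parallel programming to reduce the running time.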

def solve(Q,G,n):
    i = 0
    tol = 10**-4

    while i < 1000:

        inneropt,partition,x = setinner(Q,G,n)
        outeropt = setouter(Q,G,n)

        if (outeropt - inneropt)/(1 + abs(outeropt) + abs(inneropt)) < tol:
            break

        node1 = partition[0]
        node2 = partition[1]

        G = updateGraph(G,node1,node2)
        if i == 999:
            print("Maximum iterations reached")
        i += 1  # advance the counter so the loop can terminate
    print(inneropt)

Answer 1

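It's hard to parallelize code that needs to mutate the same shared data from different tasks. So, I'm going to assume that setinner and setouter are non-mutating functions; if that's not true, things get more complicated.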
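To see why mutation matters here, a minimal sketch may help. With multiprocessing, arguments are pickled and sent to the worker process, so the worker changes its own copy rather than the caller's object. The function append_marker and the sample list are hypothetical, made up purely for illustration:

```python
import multiprocessing


def append_marker(items):
    # Runs in a worker process: 'items' is a pickled copy of the argument,
    # not the list that lives in the parent process.
    items.append("changed in worker")
    return items


if __name__ == "__main__":
    data = ["original"]
    pool = multiprocessing.Pool(1)
    try:
        returned = pool.apply_async(append_marker, (data,)).get()
    finally:
        pool.close()
        pool.join()
    print(returned)  # the worker's modified copy, sent back as the result
    print(data)      # the parent's list, which the worker never touched
```

So if setinner or setouter relied on modifying Q or G in place, those changes would be lost unless they were returned explicitly.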

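The first step is to decide what you want to do in parallel.

One obvious thing is to run setinner and setouter at the same time. They're completely independent of each other, and both always need to get done. So, that's what I'll do. Instead of doing this: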

inneropt,partition,x = setinner(Q,G,n)
outeropt = setouter(Q,G,n)

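...we want to submit the two functions as tasks to the pool, then wait for both to be done, then get the results of both.

The concurrent.futures module (which requires a third-party backport in Python 2.x) makes it easier to do things like "wait for both to be done" than the multiprocessing module (which is in the stdlib in 2.6+), but in this case we don't need anything fancy; if one of them finishes early, we have nothing to do until the other finishes anyway. So, let's stick with multiprocessing.Pool.apply_async: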

import multiprocessing

pool = multiprocessing.Pool(2) # we never have more than 2 tasks to run
while i < 1000:
    # start both tasks in parallel
    inner_result = pool.apply_async(setinner, (Q, G, n))
    outer_result = pool.apply_async(setouter, (Q, G, n))

    # sequentially wait for both tasks to finish and get their results
    inneropt,partition,x = inner_result.get()
    outeropt = outer_result.get()

    # the rest of your loop is unchanged

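You may want to move the pool outside the function, so that it lives forever and can be used by other parts of your code. If not, you almost certainly want to shut the pool down at the end of the function. (Later versions of multiprocessing let you use the pool in a with statement, but that requires Python 3.3+, so on older versions you have to do it explicitly.)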
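Putting those pieces together, here is a minimal runnable sketch of the loop with an explicit shutdown. The bodies of setinner, setouter and updateGraph are toy stand-ins (the question doesn't show the real ones), and passing the pool in as a parameter is just one way to let it outlive the function:

```python
import multiprocessing


def setinner(Q, G, n):
    # Stand-in for the question's setinner: returns (inneropt, partition, x).
    return Q - G, (0, 1), None


def setouter(Q, G, n):
    # Stand-in for the question's setouter: returns outeropt.
    return Q + G


def updateGraph(G, node1, node2):
    # Stand-in update that halves the gap so the toy loop converges.
    return G / 2.0


def solve(Q, G, n, pool, max_iter=1000, tol=1e-4):
    inneropt = None
    for i in range(max_iter):
        # Start both tasks, then block until each result is available.
        inner_result = pool.apply_async(setinner, (Q, G, n))
        outer_result = pool.apply_async(setouter, (Q, G, n))
        inneropt, partition, x = inner_result.get()
        outeropt = outer_result.get()
        if (outeropt - inneropt) / (1 + abs(outeropt) + abs(inneropt)) < tol:
            break
        G = updateGraph(G, partition[0], partition[1])
    return inneropt


if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    try:
        print(solve(1.0, 1.0, 3, pool))
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker processes to exit
```

The try/finally does by hand what the with statement would otherwise handle, and it works on Python 2 as well.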

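What if you want to do more work in parallel? Well, there's nothing else obvious to do here without restructuring the loop. You can't do updateGraph until you get the results back from setinner and setouter, and there's nothing else in the loop to run.

But if you could reorganize things so that each loop's setinner was independent of everything that came before (which may or may not be possible with your algorithm; without knowing what you're doing, I can't guess), you could push 2000 tasks onto the queue up front, then loop by just grabbing results as needed. For example: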

pool = multiprocessing.Pool() # let it default to the number of cores
inner_results = []
outer_results = []
for i in range(1000):
    # queue every iteration's tasks up front; this assumes setinner and
    # setouter have been restructured to take the iteration index i
    inner_results.append(pool.apply_async(setinner, (Q,G,n,i)))
    outer_results.append(pool.apply_async(setouter, (Q,G,n,i)))
i = 0
while i < 1000:
    inneropt,partition,x = inner_results.pop(0).get()
    outeropt = outer_results.pop(0).get()
    # the rest of your loop is the same as before

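Of course you can make this fancier.

For example, let's say you rarely need more than a couple hundred iterations, so it's wasteful to always compute 1000 of them. You can push just the first N at startup, then push one more every time through the loop (or N more every N times), so you never do more than N wasted iterations. You can't get an ideal tradeoff between perfect parallelism and minimal waste, but you can usually tune it pretty nicely.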
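The "push the first N, then one more per loop" idea could be sketched roughly like this. The evaluate function, the stopping test and the window size are all hypothetical placeholders; the only thing carried over from above is the assumption that each iteration's task is independent of the earlier ones:

```python
import collections
import multiprocessing


def evaluate(i):
    # Hypothetical stand-in for one independent iteration's work.
    return 1.0 / (i + 1)


def first_below(pool, tol, max_iter=1000, window=8):
    pending = collections.deque()
    next_i = 0
    # Prime the queue with the first `window` tasks.
    while next_i < min(window, max_iter):
        pending.append(pool.apply_async(evaluate, (next_i,)))
        next_i += 1
    i = 0
    while pending:
        # Results are consumed in submission order, so this is iteration i.
        value = pending.popleft().get()
        if value < tol:
            return i, value  # fewer than `window` queued tasks are wasted
        # Top the queue back up with one new task per consumed result.
        if next_i < max_iter:
            pending.append(pool.apply_async(evaluate, (next_i,)))
            next_i += 1
        i += 1
    return None


if __name__ == "__main__":
    pool = multiprocessing.Pool()
    try:
        print(first_below(pool, 0.1))
    finally:
        pool.close()
        pool.join()
```

A deque is used instead of list.pop(0) only because popping from the left of a deque is cheap; with a window this small either would do.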

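Also, if the tasks don't actually take that long, but you have a lot of them, you may want to batch them up. One really easy way to do this is to use one of the map variants instead of apply_async; this can make your result-fetching code a tiny bit more complicated, but it makes the queuing and batching code completely trivial (e.g., mapping each func over a list of 100 parameters with a chunksize of 10 is just two simple lines of code).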
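As a rough illustration of that last point, here is a sketch using map_async. The functions square and cube and the 100-element parameter list are made up for the example; the two "simple lines" are the two map_async calls:

```python
import multiprocessing


def square(x):
    # Hypothetical cheap task, standing in for the first function.
    return x * x


def cube(x):
    # Hypothetical cheap task, standing in for the second function.
    return x * x * x


if __name__ == "__main__":
    params = list(range(100))
    pool = multiprocessing.Pool()
    try:
        # map_async returns immediately, so both batches run concurrently;
        # chunksize=10 hands the 100 parameters to workers 10 at a time.
        squares_result = pool.map_async(square, params, chunksize=10)
        cubes_result = pool.map_async(cube, params, chunksize=10)
        squares = squares_result.get()  # list of results, in input order
        cubes = cubes_result.get()
    finally:
        pool.close()
        pool.join()
    print(squares[:5])
    print(cubes[:5])
```

The fetching side changes because each get() now returns a whole list rather than a single result.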

Solution 1
1 2013-12-12 23:44:41
