
Python: How to run nested parallel process in python?

I have a dataset df of trader transactions. I have 2 levels of for loops, as follows:

smartTrader =[]

for asset in range(len(Assets)):
    df = df[df['Assets'] == asset]
    # I have some more calculations here
    for trader in range(len(df['TraderID'])):
        # I have some calculations here, If trader is successful, I add his ID  
        # to the list as follows
        smartTrader.append(df['TraderID'][trader])

    # some more calculations here which are related to the first for loop.

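I want to parallelize the calculations for each asset in Assets, and I also want to parallelize the calculations for every trader within each asset. After ALL these calculations are done, I want to do additional analysis based on the smartTrader list.

This is my first attempt at parallel processing, so please be patient with me. I appreciate your help.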

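If you use pathos, which provides a fork of multiprocessing, you can easily nest parallel maps. pathos is built for easily testing combinations of nested parallel maps, which are direct translations of nested for loops. It provides a selection of maps that are blocking, non-blocking, iterative, asynchronous, serial, parallel, and distributed.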

>>> from pathos.pools import ProcessPool, ThreadPool
>>> amap = ProcessPool().amap
>>> tmap = ThreadPool().map
>>> from math import sin, cos
>>> print(amap(tmap, [sin,cos], [range(10),range(10)]).get())
[[0.0, 0.8414709848078965, 0.9092974268256817, 0.1411200080598672, -0.7568024953079282, -0.9589242746631385, -0.27941549819892586, 0.6569865987187891, 0.9893582466233818, 0.4121184852417566], [1.0, 0.5403023058681398, -0.4161468365471424, -0.9899924966004454, -0.6536436208636119, 0.2836621854632263, 0.9601702866503661, 0.7539022543433046, -0.14550003380861354, -0.9111302618846769]]

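This example uses a process pool and a thread pool, where the thread map call is blocking while the process map call is asynchronous (note the get at the end of the last line).

Get pathos here: https://github.com/uqfoundation or with: $ pip install git+https://github.com/uqfoundation/pathos.git@master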
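For readers without pathos installed, the same blocking-inside-asynchronous pattern can be sketched with the standard library's concurrent.futures. This is not the pathos API, only an illustration of the idea: the helper name inner_map is hypothetical, and threads are used at both levels here (an assumption made to keep the sketch portable; pathos uses a process pool for the outer level).

```python
from concurrent.futures import ThreadPoolExecutor
from math import sin, cos

def inner_map(func, values):
    # Blocking inner map: returns only after every value has been computed.
    with ThreadPoolExecutor(max_workers=4) as inner:
        return list(inner.map(func, values))

with ThreadPoolExecutor(max_workers=2) as outer:
    # Asynchronous outer calls: submit() returns a Future immediately.
    futures = [outer.submit(inner_map, f, range(10)) for f in (sin, cos)]
    # result() plays the role of get() in the pathos example.
    results = [fut.result() for fut in futures]

print(results[0][:2], results[1][:2])
```

The nesting mirrors the two for loops in the question: one outer task per function (or asset), and one inner map over the values (or traders) inside each task.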

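Nested parallelism can be done elegantly with Ray, a system that lets you easily parallelize and distribute your Python code.

Assume you want to parallelize the following nested program: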

def inner_calculation(asset, trader):
    return trader

def outer_calculation(asset):
    return  asset, [inner_calculation(asset, trader) for trader in range(5)]

inner_results = []
outer_results = []

for asset in range(10):
    outer_result, inner_result = outer_calculation(asset)
    outer_results.append(outer_result)
    inner_results.append(inner_result)

# Then you can filter inner_results to get the final output.

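Below is the Ray code that parallelizes the code above:

  • Use the @ray.remote decorator for each function that you want to execute concurrently in its own process. A remote function returns a future (i.e., an identifier of the result) rather than the result itself.
  • When invoking a remote function f(), use the remote modifier, i.e., f.remote()
  • Use the ids_to_vals() helper function to convert a nested list of ids to values.

Note that the program structure is the same. You only need to add remote and then convert the futures (ids) returned by the remote functions to values using the ids_to_vals() helper function.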

import ray

ray.init()

# Define inner calculation as a remote function.
@ray.remote
def inner_calculation(asset, trader):
    return trader

# Define outer calculation to be executed as a remote function.
# (Newer Ray releases name this option num_returns.)
@ray.remote(num_return_vals = 2)
def outer_calculation(asset):
    return  asset, [inner_calculation.remote(asset, trader) for trader in range(5)]

# Helper to convert a nested list of object ids to a nested list of corresponding objects.
# (Newer Ray releases name this type ray.ObjectRef instead of ray.ObjectID.)
def ids_to_vals(ids):
    if isinstance(ids, ray.ObjectID):
        ids = ray.get(ids)
    if isinstance(ids, ray.ObjectID):
        return ids_to_vals(ids)
    if isinstance(ids, list):
        results = []
        for id in ids:
            results.append(ids_to_vals(id))
        return results
    return ids

outer_result_ids = []
inner_result_ids = []

for asset in range(10):
    outer_result_id, inner_result_id = outer_calculation.remote(asset)
    outer_result_ids.append(outer_result_id)
    inner_result_ids.append(inner_result_id)

outer_results = ids_to_vals(outer_result_ids)
inner_results = ids_to_vals(inner_result_ids)

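There are a number of advantages of using Ray over the multiprocessing module. In particular, the same code will run on a single machine as well as on a cluster of machines. For more advantages of Ray, see this related post.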

Threading, from the standard Python library, is probably the most convenient approach:

import threading

def worker(id):
    # Do your calculations here
    return

threads = []
for asset in range(len(Assets)):
    df = df[df['Assets'] == asset]
    for trader in range(len(df['TraderID'])):
        t = threading.Thread(target=worker, args=(trader,))
        threads.append(t)
        t.start()
    # add a semaphore here if you need to synchronize results for all traders.
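Since a thread's target cannot hand back a return value directly, one way to collect the successful traders is a shared list guarded by a lock, followed by join() on every thread before the list is read. The sketch below assumes made-up data: the trades dictionary and the "profit > 0" success rule are hypothetical stand-ins for the real per-trader calculation.

```python
import threading

smart_traders = []
lock = threading.Lock()

def worker(trader_id, profit):
    # Hypothetical success criterion; replace with the real calculation.
    if profit > 0:
        # The lock makes the intent explicit: one writer at a time.
        with lock:
            smart_traders.append(trader_id)

# Made-up sample data: trader ID -> total profit.
trades = {"t1": 10, "t2": -4, "t3": 3}

threads = [threading.Thread(target=worker, args=(tid, profit))
           for tid, profit in trades.items()]
for t in threads:
    t.start()
# Wait for every worker to finish before using the results.
for t in threads:
    t.join()

print(sorted(smart_traders))
```

Threads finish in no guaranteed order, so the list is sorted before it is used. For CPU-bound calculations, threads in CPython generally do not run Python code in parallel, so a process-based approach may be more suitable.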

Instead of using for, use map:

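smartTrader = []

m = map(calculations_as_a_function,
        [df[df['Assets'] == asset]
         for asset in range(len(Assets))])
# Collect the mapped results. list.append takes a single argument,
# so it cannot serve as the two-argument function functools.reduce expects.
smartTrader.extend(m)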

From then on, you can try different parallel map implementations, such as multiprocessing's or stackless'.
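As a rough illustration of swapping in a parallel map, the sketch below uses multiprocessing.pool.ThreadPool, which exposes the same map interface as multiprocessing.Pool. The function names successful_traders and find_smart_traders, the sample data, and the "positive total profit" rule are all hypothetical; they stand in for calculations_as_a_function and the per-asset slices of df.

```python
from multiprocessing.pool import ThreadPool

def successful_traders(asset_trades):
    # Hypothetical per-asset calculation: IDs of traders whose
    # total profit on this asset is positive.
    totals = {}
    for trader_id, profit in asset_trades:
        totals[trader_id] = totals.get(trader_id, 0) + profit
    return sorted(t for t, total in totals.items() if total > 0)

def find_smart_traders(trades_by_asset, workers=4):
    # pool.map has the same call shape as the built-in map,
    # but runs the function on several workers at once.
    with ThreadPool(workers) as pool:
        per_asset = pool.map(successful_traders, trades_by_asset)
    smart = []
    for ids in per_asset:
        smart.extend(ids)
    return smart

# Made-up sample: one list of (trader ID, profit) pairs per asset.
sample = [
    [("t1", 5), ("t2", -3), ("t1", 2)],
    [("t3", 1), ("t2", 4)],
]
smart_traders = find_smart_traders(sample)
print(smart_traders)
```

Replacing ThreadPool with multiprocessing.Pool gives process-based parallelism with the same map call, provided the mapped function is defined at module level and the pool is created under an `if __name__ == "__main__":` guard.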
