Python: Running multiple functions simultaneously with different execution times

Question

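I am working on a project that needs to run two different CPU-intensive functions, so a multiprocessing approach seems to be the way to go. The challenge is that one function runs much slower than the other. For the sake of argument, let's say execute takes 0.1 seconds to run, while update takes a full second. The goal is that execute computes an output value 10 times while update is running. Once update has finished, it needs to pass a set of parameters to execute, which can then continue generating output with the new set of parameters. After some time, update needs to run again and once more generate a new set of parameters.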

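Furthermore, the two functions each require a different set of input variables.

The image linked below should hopefully visualize my conundrum a bit better.

Function runtime visualization

From what I have gathered (https://zetcode.com/python/multiprocessing/), using an asymmetric mapping approach might be the way to go, but it doesn't really seem to work. Any help is greatly appreciated.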

Pseudocode

from multiprocessing import Pool
from datetime import datetime
import time
import numpy as np


class MyClass():
    def __init__(self, inital_parameter_1, inital_parameter_2):
        self.parameter_1 = inital_parameter_1
        self.parameter_2 = inital_parameter_2

    def execute(self, input_1, input_2, time_in):
        print('starting execute function for time:' + str(time_in))
        time.sleep(0.1)  # wait for 100 milliseconds
        # generate some output
        output = (self.parameter_1 * input_1) + (self.parameter_2 + input_2)
        print('exiting execute function')
        return output

    def update(self, update_input_1, update_input_2, time_in):
        print('starting update function for time:' + str(time_in))
        time.sleep(1)  # wait for 1 second
        # generate parameters
        self.parameter_1 += update_input_1
        self.parameter_2 += update_input_2
        print('exiting update function')

    def smap(f):
        return f()


if __name__ == "__main__":
    update_input_1 = 3
    update_input_2 = 4
    input_1 = 0
    input_2 = 1
    # initialize class
    my_class = MyClass(1, 2)

    # total runtime (arbitrary)
    runtime = int(10e6)
    # update_time (arbitrary)
    update_time = np.array([10, 10e2, 15e4, 20e5])

    for current_time in range(runtime):
        # if time equals update time run both functions simultanously until update is complete
        if any(update_time == current_time):
            with Pool() as pool:
                res = pool.map_async(my_class.smap, [my_class.execute(input_1, input_2, current_time),
                                                     my_class.update(update_input_1, update_input_2, current_time)])
        # otherwise run only execute
        else:
            output = my_class.execute(input_1, input_2,current_time)
        
        # increment input 
        input_1 += 1
        input_2 += 2

Answer 1

I admit I can't fully reconcile your code with your description. But I do see some problems:

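The method update returns no value other than None, which is returned implicitly because there is no return statement.
Your with Pool() ...: block calls terminate when the block exits, which happens immediately after your call to pool.map_async, because that call is non-blocking. You have made no provision to wait for the submitted tasks to complete, so terminate will most likely kill the running tasks before they finish.
What you pass to map_async is a worker function and an iterable. But you are calling the methods execute and update from the current main process and using their return values as the elements of the iterable, and those return values are definitely not suitable for passing to smap. So no multiprocessing is taking place, and this is simply wrong.
You are also creating and destroying the process pool over and over again. It is much better to create the pool only once.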
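
To make the second and third points concrete, here is a minimal sketch (not your code: slow_square is a hypothetical stand-in worker and run_in_pool is a name chosen only for illustration). It hands map_async the function itself plus an iterable of arguments, and calls get() inside the with block so the pool is not terminated before the results arrive.

```python
from multiprocessing import Pool
import time


def slow_square(x):
    """Hypothetical stand-in for a CPU-bound worker function."""
    time.sleep(0.1)
    return x * x


def run_in_pool(values):
    with Pool(processes=2) as pool:
        # Pass the function object and an iterable of arguments.
        # Do NOT call the function here; the workers call it.
        async_result = pool.map_async(slow_square, values)
        # map_async returns immediately, so block here until the workers
        # are done; leaving the with block calls terminate() on the pool.
        return async_result.get(timeout=30)


if __name__ == "__main__":
    print(run_in_pool([1, 2, 3]))
```

Moving the get() call outside the with block would reproduce the problem described above, because the pool would already have been terminated.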

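So I would recommend at least the following changes. But note that this code can generate tasks far faster than they can be completed, and given your current value of runtime you could end up with millions of tasks queued up waiting to run, which could put a great strain on system resources such as memory. So I have inserted some code to throttle task submission so that the number of submitted but incomplete tasks never exceeds three times the number of available CPU cores.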

# we won't need heavy-duty numpy for what we are doing:
#import numpy as np
from multiprocessing import cpu_count
from threading import Lock
... # etc.

if __name__ == "__main__":
    update_input_1 = 3
    update_input_2 = 4
    input_1 = 0
    input_2 = 1
    # initialize class
    my_class = MyClass(1, 2)

    # total runtime (arbitrary)
    runtime = int(10e6)
    # update_time (arbitrary)
    # we don't need overhead of numpy (remove import of numpy):
    #update_time = np.array([10, 10e2, 15e4, 20e5])
    update_time = [10, 10e2, 15e4, 20e5]

    tasks_submitted = 0
    lock = Lock()

    execute_output = []
    def execute_result(result):
        global tasks_submitted

        with lock:
            tasks_submitted -= 1
        # result is the return value from method execute
        # do something with it, e.g. execute_output.append(result)
        pass

    update_output = []
    def update_result(result):
        global tasks_submitted

        with lock:
            tasks_submitted -= 1
        # result is the return value from method update
        # do something with it, e.g. update_output.append(result)
        pass

    n_processors = cpu_count()
    with Pool() as pool:
        for current_time in range(runtime):
            # if time equals update time run both functions simultanously until update is complete
            #if any(update_time == current_time):
            if current_time in update_time:
                # run both update and execute:
                pool.apply_async(my_class.update, args=(update_input_1, update_input_2, current_time), callback=update_result)
                with lock:
                    tasks_submitted += 1
            pool.apply_async(my_class.execute, args=(input_1, input_2, current_time), callback=execute_result)
            with lock:
                tasks_submitted += 1

            # increment input
            input_1 += 1
            input_2 += 2
            while tasks_submitted > n_processors * 3:
                time.sleep(.05)
        # Ensure all tasks have completed:
        pool.close()
        pool.join()
        assert(tasks_submitted == 0)
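
One thing the callbacks above depend on is that update actually returns something. A Pool worker operates on a pickled copy of my_class, so attributes assigned inside update in the worker do not change the object in the main process. Below is a minimal sketch of one way around that. It assumes update and execute can be written as plain functions that take the parameters as arguments; the names run and on_update and the shortened sleep times are illustrative assumptions, not part of your code. update returns the new parameters, a callback stores them in the parent, and execute keeps running in the parent with whatever parameters are current.

```python
from multiprocessing import Pool
import time


def update(parameter_1, parameter_2, update_input_1, update_input_2):
    """Slow step: compute and return the new parameters (no shared state)."""
    time.sleep(0.2)  # stand-in for a long computation
    return parameter_1 + update_input_1, parameter_2 + update_input_2


def execute(parameter_1, parameter_2, input_1, input_2):
    """Fast step: same formula as MyClass.execute."""
    time.sleep(0.02)  # stand-in for a short computation
    return (parameter_1 * input_1) + (parameter_2 + input_2)


def run(steps=20, update_at=(5,)):
    params = [(1, 2)]  # current parameters, owned by the parent process
    outputs = []

    def on_update(new_params):
        # Called in the parent process once the worker has finished update
        params[0] = new_params

    input_1, input_2 = 0, 1
    with Pool(processes=1) as pool:
        for current_time in range(steps):
            parameter_1, parameter_2 = params[0]
            if current_time in update_at:
                # Start update in the worker process; do not wait for it
                pool.apply_async(update, (parameter_1, parameter_2, 3, 4),
                                 callback=on_update)
            # Keep producing outputs in the parent with the current parameters
            outputs.append(execute(parameter_1, parameter_2, input_1, input_2))
            input_1 += 1
            input_2 += 2
        pool.close()
        pool.join()  # wait for any update that is still running
    return params[0], outputs


if __name__ == "__main__":
    final_params, outputs = run()
    print(final_params, len(outputs))
```

How many outputs are computed with the old parameters depends on timing; the only guarantee is that after join() the parent holds the updated values.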

Solution 1
0 2021-07-11 14:40:06
