SCOOP - How to make workers wait for root worker before continuing
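I am using SCOOP at work (with Python 3.6, which I cannot update). I need all workers to perform a computation, then wait for the root node to perform a slow computation (the code inside if __name__ == '__main__':), and then perform another computation using the dataframe produced by the root node.
My problem is that SCOOP starts all workers immediately, and they asynchronously run all the code outside if __name__ == '__main__':, even when that code sits below the if block. Since the dataframe does not exist yet, they throw an error.
What command can force all workers to wait for the root worker to finish a computation before they run the rest of the code?
I have experimented with scoop.futures.map, scoop.futures.supply and multiprocessing.managers, all without success. I also tried multiprocessing.Barrier(8).wait(), but it does not work.
There is a scoop.futures.wait(futures) method, but I do not know how to obtain the futures argument...
I have something like this: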
import pandas as pd
import genetic_algorithm
from scoop import futures

df = pd.read_csv('database.csv')  # dataframe is too large to be passed to fitness_function for every worker. I want every worker to have a copy of it!

if __name__ == '__main__':
    df = add_new_columns(df)  # heavy computation which I just want to perform once (not by all workers)

df = computation_using_new_columns(df)  # <--- !!! error - is executed before slow add_new_columns(df) finishes

def fitness_function(): ...  # all workers use fitness_function() and an error is thrown if I put it inside the if __name__ == '__main__':

if __name__ == '__main__':
    results = list(futures.map(genetic_algorithm, df))
I run the script with python3 -m scoop script.py, and it starts all workers immediately...
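Each process has its own memory space, so modifying the dataframe in the main process does not affect the workers. After processing it, you need to pass it to the workers through some kind of initializer, which does not seem to be available in the SCOOP framework. A more flexible (but slightly more complex) tool is Python's built-in multiprocessing.Pool: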
import pandas as pd
import genetic_algorithm
from multiprocessing import Pool

def fitness_function(): ...

def initializer_func(df_from_parent):
    # runs once in every worker process when the pool starts
    global df
    df = df_from_parent
    df = computation_using_new_columns(df)

if __name__ == '__main__':
    # read the df in the main process only, as it needs to be modified
    # before sending it to the workers
    df = pd.read_csv('database.csv')
    df = add_new_columns(df)  # modify the df in the main process
    # create as many workers as there are CPU cores, pass the df to each of them,
    # and have each worker run computation_using_new_columns on it
    with Pool(initializer=initializer_func, initargs=(df,)) as pool:
        results = list(pool.imap(genetic_algorithm, df))  # now do your function
If computation_using_new_columns needs to run in every worker, you can keep it in the initializer. If it only needs to run once, you can instead put it inside if __name__ == "__main__", right after add_new_columns.
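For the run-once variant, a minimal self-contained sketch might look like the following. The bodies of add_new_columns and computation_using_new_columns, the column names x, y and z, the init_worker name, and the fitness_function signature are all hypothetical stand-ins, since the original functions are not shown. Both computations run once in the main process, and the initializer only stores the finished dataframe in each worker.

```python
import multiprocessing as mp

import pandas as pd

_df = None  # worker-local copy of the dataframe, set by init_worker


def add_new_columns(df):
    # Hypothetical stand-in for the slow step that should run only once.
    out = df.copy()
    out["y"] = out["x"] * 2
    return out


def computation_using_new_columns(df):
    # Hypothetical stand-in for the step that depends on the new columns.
    out = df.copy()
    out["z"] = out["x"] + out["y"]
    return out


def init_worker(df_from_parent):
    # Runs once in each worker process; keeps the finished dataframe
    # in a module-level global so tasks do not receive it as an argument.
    global _df
    _df = df_from_parent


def fitness_function(row_index):
    # Reads from the worker-local dataframe (assumed signature).
    return int(_df["z"].iloc[row_index])


def main():
    df = pd.DataFrame({"x": [1, 2, 3]})
    df = add_new_columns(df)                # once, in the main process
    df = computation_using_new_columns(df)  # once, in the main process
    # The pool is created only after both computations have finished,
    # so no worker can ever see a half-built dataframe.
    with mp.Pool(processes=2, initializer=init_worker, initargs=(df,)) as pool:
        return pool.map(fitness_function, range(len(df)))


if __name__ == "__main__":
    print(main())
```

Because the workers are created after the main process has finished preparing the dataframe, no explicit barrier or wait call is needed. The dataframe is sent once per worker through initargs rather than once per task.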