Run a Python function in parallel
I want to run a "main" function n times. This function starts other functions while it runs. The "main" function is called `repeat`; when it runs, it first calls the function `copula_sim`, which produces an output called `total_summe_liste`. This list is appended to `mega_summe_list`, which saves the outputs of all n runs. The sorted `total_summe_liste` is saved as `RM_list`, which is the input for the functions `VaR_func`, `CVaR_func`, and `power_func`; each of these generates an output that is stored in the corresponding list `RM_VaR_list`, `RM_CVaR_list`, or `RM_PSRM_list`. After that, `RM_list` and `total_summe_liste` are cleared before the next run begins.

In the end I have `mega_summe_list`, `RM_VaR_list`, `RM_CVaR_list`, and `RM_PSRM_list`, which are used to generate a plot and a dataframe.

Now I want to run the `repeat` function in parallel. For example, when I want to run this function n = 10 times, I want to run it on 10 CPU cores at the same time. The reason is that `copula_sim` is a Monte Carlo simulation, which takes a while when the simulation is large.

What I have is this:
total_summe_liste = []
RM_VaR_list = []
RM_CVaR_list = []
RM_PSRM_list = []
mega_summe_list = []

def repeat():
    global RM_list
    global total_summe_liste
    global RM_VaR_list
    global RM_CVaR_list
    global RM_PSRM_list
    global mega_summe_list
    copula_sim(runs_sim, rand_x, rand_y, mu, full_log=False)
    mega_summe_list += total_summe_liste
    RM_list = sorted(total_summe_liste)
    VaR_func(alpha)
    RM_VaR_list.append(VaR)
    CVaR_func(alpha)
    RM_CVaR_list.append(CVaR)
    power_func(gamma)
    RM_PSRM_list.append(risk)
    RM_list = []
    total_summe_liste = []

n = 10
for i in range(0, n):
    repeat()
which works so far.

I tried:
if __name__ == '__main__':
    jobs = []
    for i in range(0, 10):
        p = mp.Process(target=repeat)
        jobs.append(p)
        p.start()
But when I run this, `mega_summe_list` is empty. When I add `print(VaR)` to `repeat`, it prints all 10 VaR values once the runs finish, so the parallel execution itself is working.

What is the problem?
The reason for this issue is that the list `mega_summe_list` is not shared between the processes.

When you use multiprocessing in Python, each worker process imports the functions and variables and runs them independently. So, for instance, when you start 5 processes, 5 separate copies of these variables are created and modified independently. When you then access `mega_summe_list` in the main process, it is still empty, because it was never modified in that process.
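For illustration, here is a minimal self-contained sketch of that isolation (the `worker` function and `results` list are hypothetical, not from the question's code):

```python
from multiprocessing import Process

results = []

def worker():
    # Each child process operates on its own copy of `results`;
    # this append is invisible to the parent process.
    results.append(42)

if __name__ == "__main__":
    ps = [Process(target=worker) for _ in range(3)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()
    print(results)  # [] -- the parent's copy was never touched
```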
To share state between processes, you can use a list proxy from the multiprocessing package. A multiprocessing Manager maintains a separate server process in which these Python objects are held.

Below is the code used to create a multiprocessing Manager list:

from multiprocessing import Manager
mega_summe_list = Manager().list()

This can be used instead of mega_summe_list = [] when using multiprocessing.
Below is an example:

from multiprocessing.pool import Pool
from multiprocessing import Manager

def repeat_test(_):
    global b, mp_list
    a = [1, 2, 3]
    b += a
    mp_list += a  # multiprocessing Manager list

if __name__ == "__main__":
    b = []
    mp_list = Manager().list()

    p = Pool(5)
    p.map(repeat_test, range(5))
    print("b: {0},\nmp_list: {1}".format(b, mp_list))
Output:

b: [],
mp_list: [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

Note that the plain list b stays empty in the main process, while the Manager list collects the results from all workers. Hope this solves your problem.
You should use a multiprocessing Pool; then you can do something like:

p = Pool(10)
p.map(repeat, range(10))
I solved the problem this way:

This is the function I want to repeat n times in parallel:
from multiprocessing import Process
from multiprocessing import Manager

def repeat(shared_list, VaR_list, CVaR_list, PSRM_list, i):
    global RM_list
    global total_summe_liste
    copula_sim(runs_sim, rand_x, rand_y, mu, full_log=False)
    shared_list += total_summe_liste
    RM_list = sorted(total_summe_liste)
    VaR_func(alpha)
    VaR_list.append(VaR)
    CVaR_func(alpha)
    CVaR_list.append(CVaR)
    power_func(gamma)
    PSRM_list.append(risk)
    RM_list = []
    total_summe_liste = []
This part creates the shared lists and does the parallel work. Thanks @noufel13!
RM_VaR_list = []
RM_CVaR_list = []
RM_PSRM_list = []
mega_summe_list = []

if __name__ == "__main__":
    with Manager() as manager:
        shared_list = manager.list()
        VaR_list = manager.list()
        CVaR_list = manager.list()
        PSRM_list = manager.list()
        processes = []
        for i in range(12):
            p = Process(target=repeat, args=(shared_list, VaR_list, CVaR_list, PSRM_list, i))  # passing the lists
            p.start()
            processes.append(p)
        for p in processes:
            p.join()
        RM_VaR_list += VaR_list
        RM_CVaR_list += CVaR_list
        RM_PSRM_list += PSRM_list
        mega_summe_list += shared_list
    RM_frame_func()
    plotty_func()
Thank you!

The only question left is how to handle big arrays. Is there a way to do this more efficiently? One of the 12 shared lists can have more than 100,000,000 items, so in total mega_summe_list has about 1,200,000,000 items...