How to make Python code with two for loops run faster (is there a Python way of doing Mathematica's Parallelize)?
I am completely new to Python, or to any such programming language. I have some experience with Mathematica. I have a mathematical problem that Mathematica solves with its own 'Parallelize' methods, but the run uses all the cores and leaves the system quite exhausted; I can barely use the machine while it is running. Hence, I was looking for a coding alternative and found Python fairly easy to learn and implement. So without further ado, let me describe the mathematical problem and the issues with my Python code. As the full code is too long, let me give an outline.
1. Numerically solve a differential equation of the form y''(t) + f(t) y(t) = 0 to get y(t) for some range, say C <= t <= D.
2. Next, interpolate the numerical result over some desired range to get the function w(t), say for A <= t <= B.
3. Using w(t), solve another differential equation of the form z''(t) + [a + b w(t)] z(t) = 0 for some range of a and b, for which I am using the loop.
4. Define F = 1 + sol1[157], to make a list like {a, b, F}.

So let me give a prototype loop, as this takes most of the computation time.
for q in np.linspace(0.0, 4.0, 100):
    for a in np.linspace(-2.0, 7.0, 100):
        print('Solving for q = {}, a = {}'.format(q,a))
        sol1 = odeint(fun, [1, 0], t, args=( a, q))[..., 0]
        print(t[157])
        F = 1 + sol1[157]
        f1.write("{} {} {} \n".format(q, a, F))
f1.close()
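For readers who want something runnable, steps 1 to 3 of the outline could be sketched roughly as follows. This is only an illustration under stated assumptions: the coefficient f(t), the ranges C, D, A, B, the initial conditions of the first equation, and the use of scipy's CubicSpline for the interpolation are placeholders, not taken from the full code. The name fun and the call signature args=(a, q) follow the prototype loop, with q playing the role of b.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.interpolate import CubicSpline


def f(t):
    # hypothetical coefficient; stands in for the real f(t) of the full code
    return 2.0 + np.cos(t)


def first_ode(y, t):
    # y'' + f(t) y = 0 written as a first-order system in (y, y')
    y0, y1 = y
    return [y1, -f(t) * y0]


# Step 1: solve the first equation on C <= t <= D (assumed values)
C, D = 0.0, 12.0
t_grid = np.linspace(C, D, 1201)
y = odeint(first_ode, [1.0, 0.0], t_grid)[:, 0]

# Step 2: interpolate the numerical result to get a callable w(t)
w = CubicSpline(t_grid, y)


def fun(z, t, a, q):
    # z'' + [a + q w(t)] z = 0 written as a first-order system in (z, z')
    z0, z1 = z
    return [z1, -(a + q * float(w(t))) * z0]


# Step 3: solve the second equation on A <= t <= B for one (a, q) pair
A, B = 1.0, 11.0
t = np.linspace(A, B, 201)
sol1 = odeint(fun, [1, 0], t, args=(1.0, 0.5))[..., 0]

# Step 4: the quantity written to the file
F = 1 + sol1[157]
print(F)
```

Note that fun calls the interpolant at every step of every solve, so in a setup like this the cost of evaluating w(t) is paid inside the innermost part of the double loop.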
Now, the real loop takes about 4 hours and 30 minutes to complete (with some built-in functional form of w(t), it takes about 2 minutes). When I applied numba/autojit before the definition of fun in my code (without properly understanding what it does or how!), the run time improved significantly, to about 2 hours and 30 minutes. Also, writing the two loops as a single loop over itertools/product reduces the run time further, but only by about 2 minutes! However, Mathematica, when I let it use all 4 cores, finishes the task within 30 minutes.

So, is there a way to improve the runtime in Python?
To speed up Python, you have three options:

Implementing multiprocessing - example using the prototype loop from the original question

I assume that the computations you do inside the nested loops in your prototype code are actually independent of one another. Since your prototype code is incomplete, however, I am not sure this is the case. If they are not independent, this approach will, of course, not work. I will give an example that uses, for the fun function, not your differential equation problem but a prototype with the same signature (input and output variables).
import queue

import numpy as np
import scipy.integrate
import multiprocessing as mp


def fun(y, t, b, c):
    # replace this function with whatever function you want to work with
    # (this one is the example function from the scipy docs for odeint)
    theta, omega = y
    dydt = [omega, -b*omega - c*np.sin(theta)]
    return dydt


# definitions of work thread and write thread functions
# (both are run as separate processes via mp.Process)
def run_thread(input_queue, output_queue):
    # run threads will pull tasks from the input_queue, push results into output_queue
    while True:
        try:
            queueitem = input_queue.get(block=False)
        except queue.Empty:
            # all tasks are queued before the workers start, so empty means done
            print("Queue exhausted, terminating")
            break
        if len(queueitem) == 3:
            a, q, t = queueitem
            sol1 = scipy.integrate.odeint(fun, [1, 0], t, args=(a, q))[..., 0]
            F = 1 + sol1[157]
            output_queue.put((q, a, F))


def write_thread(output_queue):
    # write thread will pull results from output_queue, write them to outputfile.txt
    with open("outputfile.txt", "w") as f1:
        while True:
            # blocking get: waits until a result or the TERMINATE marker arrives
            queueitem = output_queue.get()
            if queueitem[0] == "TERMINATE":
                break
            q, a, F = queueitem
            print("{} {} {} \n".format(q, a, F))
            f1.write("{} {} {} \n".format(q, a, F))


# the guard keeps child processes from re-running the setup when they import this module
if __name__ == "__main__":
    # define time point sequence
    t = np.linspace(0, 10, 201)

    # prepare input and output Queues
    mpM = mp.Manager()
    input_queue = mpM.Queue()
    output_queue = mpM.Queue()

    # prepare tasks, collect them in input_queue
    for q in np.linspace(0.0, 4.0, 100):
        for a in np.linspace(-2.0, 7.0, 100):
            # Your computations as commented here will now happen in run_threads as defined above and created below
            # print('Solving for q = {}, a = {}'.format(q,a))
            # sol1 = scipy.integrate.odeint(fun, [1, 0], t, args=( a, q))[..., 0]
            # print(t[157])
            # F = 1 + sol1[157]
            input_tupel = (a, q, t)
            input_queue.put(input_tupel)

    # create threads
    thread_number = mp.cpu_count()
    procs_list = [mp.Process(target=run_thread, args=(input_queue, output_queue)) for i in range(thread_number)]
    write_proc = mp.Process(target=write_thread, args=(output_queue,))

    # start threads
    for proc in procs_list:
        proc.start()
    write_proc.start()

    # wait for run_threads to finish
    for proc in procs_list:
        proc.join()

    # terminate write_thread
    output_queue.put(("TERMINATE",))
    write_proc.join()
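A more compact variant of the same idea is sketched below, again assuming the individual (q, a) problems are independent. It uses multiprocessing.Pool instead of hand-written queues. The names solve_one and run_grid, the module-level T, and the reduced 20 x 20 grid are illustrative choices, not part of the code above. The processes argument can be set below the core count, which may help keep the machine usable during a long run.

```python
import multiprocessing as mp

import numpy as np
from scipy.integrate import odeint


def fun(y, t, b, c):
    # same placeholder right-hand side as in the example above
    theta, omega = y
    return [omega, -b*omega - c*np.sin(theta)]


# time point sequence, shared by all tasks
T = np.linspace(0, 10, 201)


def solve_one(params):
    # solve one (q, a) problem and return the line that goes into the file
    q, a = params
    sol1 = odeint(fun, [1, 0], T, args=(a, q))[..., 0]
    return q, a, 1 + sol1[157]


def run_grid(q_values, a_values, processes=None):
    # distribute all (q, a) pairs over a pool of worker processes;
    # pool.map returns the results in the same order as the tasks
    tasks = [(q, a) for q in q_values for a in a_values]
    with mp.Pool(processes=processes) as pool:
        return pool.map(solve_one, tasks, chunksize=16)


if __name__ == "__main__":
    # leave one core free so the machine stays responsive (assumption: >= 2 cores helps)
    workers = max(1, mp.cpu_count() - 1)
    # small grid for a quick check; the question uses 100 x 100
    results = run_grid(np.linspace(0.0, 4.0, 20), np.linspace(-2.0, 7.0, 20), workers)
    with open("outputfile.txt", "w") as f1:
        for q, a, F in results:
            f1.write("{} {} {} \n".format(q, a, F))
```

Because the results come back in task order, the output file has the same line order as the sequential double loop, and no separate writer process is needed.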
Explanation

The code uses two functions:

1. A function (run_thread) that is run in the threads. This function computes individual problems until there are none left in the input queue, and pushes each result into the output queue.
2. A function (write_thread) for collecting the results from the output queue and writing them into a file.

Caveats