I have a simulation that is currently running, but the ETA is about 40 hours -- I'm trying to speed it up with multi-processing.
It essentially iterates over 3 values of one variable (L), and over 99 values of of a second variable (a). Using these values, it essentially runs a complex simulation and returns 9 different standard deviations. Thus (even though I haven't coded it that way yet) it is essentially a function that takes two values as inputs (L,a) and returns 9 values.
Here is the essence of the code I have:
STD_1 = []
STD_2 = []
# etc.
for L in range(0,6,2):
for a in range(1,100):
### simulation code ###
STD_1.append(value_1)
STD_2.append(value_2)
# etc.
Here is what I can modify it to:
master_list = []
def simulate(a,L):
### simulation code ###
return (a,L,STD_1, STD_2 etc.)
for L in range(0,6,2):
for a in range(1,100):
master_list.append(simulate(a,L))
Since each of the simulations are independent, it seems like an ideal place to implement some sort of multi-threading/processing.
How exactly would I go about coding this?
EDIT: Also, will everything be returned to the master list in order, or could it possibly be out of order if multiple processes are working?
EDIT 2: This is my code -- but it doesn't run correctly. It asks if I want to kill the program right after I run it.
import multiprocessing
data = []
for L in range(0,6,2):
for a in range(1,100):
data.append((L,a))
print (data)
def simulation(arg):
# unpack the tuple
a = arg[1]
L = arg[0]
STD_1 = a**2
STD_2 = a**3
STD_3 = a**4
# simulation code #
return((STD_1,STD_2,STD_3))
print("1")
p = multiprocessing.Pool()
print ("2")
results = p.map(simulation, data)
EDIT 3: Also what are the limitations of multiprocessing. I've heard that it doesn't work on OS X. Is this correct?
data
of those tuplesf
to process one tuple and return one resultp = multiprocessing.Pool()
object.results = p.map(f, data)
This will run as many instances of f
as your machine has cores in separate processes.
Edit1: Example:
from multiprocessing import Pool
data = [('bla', 1, 3, 7), ('spam', 12, 4, 8), ('eggs', 17, 1, 3)]
def f(t):
name, a, b, c = t
return (name, a + b + c)
p = Pool()
results = p.map(f, data)
print results
Edit2:
Multiprocessing should work fine on UNIX-like platforms such as OSX. Only platforms that lack os.fork
(mainly MS Windows) need special attention. But even there it still works. See the multiprocessing documentation.
Use Pool().imap_unordered
if ordering is not important. It will return results in a non-blocking fashion.
Here is one way to run it in parallel threads:
import threading
L_a = []
for L in range(0,6,2):
for a in range(1,100):
L_a.append((L,a))
# Add the rest of your objects here
def RunParallelThreads():
# Create an index list
indexes = range(0,len(L_a))
# Create the output list
output = [None for i in indexes]
# Create all the parallel threads
threads = [threading.Thread(target=simulate,args=(output,i)) for i in indexes]
# Start all the parallel threads
for thread in threads: thread.start()
# Wait for all the parallel threads to complete
for thread in threads: thread.join()
# Return the output list
return output
def simulate(list,index):
(L,a) = L_a[index]
list[index] = (a,L) # Add the rest of your objects here
master_list = RunParallelThreads()
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.