I'm trying to use the multiprocessing library to parallelize some expensive calculations without blocking other, much lighter ones. Both need to interact through some variables, although they may run at different paces.
To show this, I have created the following example, which works fine:
import multiprocessing
import time

import numpy as np


class SumClass:
    def __init__(self):
        self.result = 0.0
        self.p = None
        self.return_value = None

    def expensive_function(self, new_number, return_value):
        # Execute expensive calculation (runs in the child process)
        #######
        # randint's upper bound is exclusive, so this sleeps 5 to 10 seconds
        time.sleep(np.random.randint(5, 11))
        return_value.value = self.result + new_number
        #######

    def execute_function(self, new_number):
        print(' New number received: %f' % new_number)
        self.return_value = multiprocessing.Value("f", 0.0, lock=True)
        self.p = multiprocessing.Process(target=self.expensive_function, args=(new_number, self.return_value))
        self.p.start()

    def is_executing(self):
        if self.p is not None:
            if not self.p.is_alive():
                # The child has finished: collect its result
                self.result = self.return_value.value
                self.p = None
                return False
            else:
                return True
        else:
            return False


if __name__ == '__main__':
    sum_obj = SumClass()
    current_value = 0
    while True:
        if not sum_obj.is_executing():
            # Randomly determine whether the function must be executed or not
            if np.random.rand() < 0.25:
                print('Current sum value: %f' % sum_obj.result)
                new_number = np.random.rand(1)[0]
                sum_obj.execute_function(new_number)
        # Execute other (light) stuff
        #######
        print('Executing other stuff')
        current_value += sum_obj.result * 0.1
        print('Current value: %f' % current_value)
        time.sleep(1)
        #######
Basically, the main loop executes some light function and, depending on a random condition, sends some heavy work to another process, provided that process has already finished the previous job. The heavy work is carried out by an object that needs to store some data between executions. Although expensive_function takes some time, the light function keeps executing without being blocked.
Although the above code gets the job done, I'm wondering: is it the best/most appropriate method to do this?
Besides, let us suppose the class SumClass has an instance of another object, which also needs to store data. For example:
import multiprocessing
import time

import numpy as np


class Operator:
    def __init__(self):
        self.last_value = 1.0

    def operate(self, value):
        print(' Operation, last value: %f' % self.last_value)
        self.last_value *= value
        return self.last_value


class SumClass:
    def __init__(self):
        self.operator_obj = Operator()
        self.result = 0.0
        self.p = None
        self.return_value = None

    def expensive_function(self, new_number, return_value):
        # Execute expensive calculation (runs in the child process)
        #######
        # randint's upper bound is exclusive, so this sleeps 5 to 10 seconds
        time.sleep(np.random.randint(5, 11))
        # Apply operation
        number = self.operator_obj.operate(new_number)
        # Apply other operation
        return_value.value = self.result + number
        #######

    def execute_function(self, new_number):
        print(' New number received: %f' % new_number)
        self.return_value = multiprocessing.Value("f", 0.0, lock=True)
        self.p = multiprocessing.Process(target=self.expensive_function, args=(new_number, self.return_value))
        self.p.start()

    def is_executing(self):
        if self.p is not None:
            if not self.p.is_alive():
                # The child has finished: collect its result
                self.result = self.return_value.value
                self.p = None
                return False
            else:
                return True
        else:
            return False


if __name__ == '__main__':
    sum_obj = SumClass()
    current_value = 0
    while True:
        if not sum_obj.is_executing():
            # Randomly determine whether the function must be executed or not
            if np.random.rand() < 0.25:
                print('Current sum value: %f' % sum_obj.result)
                new_number = np.random.rand(1)[0]
                sum_obj.execute_function(new_number)
        # Execute other (light) stuff
        #######
        print('Executing other stuff')
        current_value += sum_obj.result * 0.1
        print('Current value: %f' % current_value)
        time.sleep(1)
        #######
Now, inside expensive_function, a member function of the Operator object is used, which needs to store the number passed.
As expected, the member variable last_value does not change in the main process, i.e. it does not keep any value between executions.
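A minimal sketch of why the value is lost, reusing the Operator class from above: the child process works on its own copy of the object, so the change never reaches the parent. The names run_in_child and demo are made up for this illustration, and the Queue is there only to report what the child saw.

```python
import multiprocessing


class Operator:
    def __init__(self):
        self.last_value = 1.0

    def operate(self, value):
        self.last_value *= value
        return self.last_value


def run_in_child(operator_obj, value, queue):
    # Hypothetical helper: runs in the child, on the child's own copy.
    operator_obj.operate(value)
    queue.put(operator_obj.last_value)


def demo():
    # Hypothetical helper: returns (value seen by child, value kept by parent).
    operator_obj = Operator()
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=run_in_child, args=(operator_obj, 3.0, queue))
    p.start()
    child_value = queue.get()  # read before join so the child can flush the queue
    p.join()
    return child_value, operator_obj.last_value


if __name__ == "__main__":
    child_value, parent_value = demo()
    print("child saw:", child_value)    # 3.0
    print("parent has:", parent_value)  # 1.0, the parent's object is untouched
```

So any state that must survive has to be sent back to the parent explicitly.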
Is there any way of doing this properly?
I can imagine I could arrange everything so that I only need one class level, and it would work well. However, this is a toy example; in reality there are several levels of complex objects and it would be hard.
Thank you very much in advance!
from concurrent.futures import ThreadPoolExecutor
import timeit

import requests


def timer(number, repeat):
    # Decorator that runs func immediately and prints its mean run time
    def wrapper(func):
        runs = timeit.repeat(func, number=number, repeat=repeat)
        print(sum(runs) / len(runs))
    return wrapper


URL = "https://httpbin.org/uuid"


# No numba jit here: nopython mode cannot compile code that uses requests objects
def fetch(session, url):
    with session.get(url) as response:
        print(response.json()['uuid'])


@timer(1, 1)
def runner():
    with ThreadPoolExecutor(max_workers=25) as executor:
        with requests.Session() as session:
            # Consume the results so exceptions raised in fetch are not silently dropped
            list(executor.map(fetch, [session] * 100, [URL] * 100))
    # Leaving the with block already waits for all worker threads to finish
Maybe this helps.
I'm using ThreadPoolExecutor for multithreading; you can also use ProcessPoolExecutor.
For your computationally expensive operation, you can use numba to compile the function to cached machine code for faster execution.
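For the question's case, here is a rough sketch of the ProcessPoolExecutor variant, not a definitive implementation. It assumes the nested object can be pickled, and it has the worker return the modified Operator together with the result so the parent can adopt the new state. The module-level expensive_function, the future attribute and main are my own names for this illustration, and the short sleeps stand in for the real work.

```python
import time
from concurrent.futures import ProcessPoolExecutor


class Operator:
    def __init__(self):
        self.last_value = 1.0

    def operate(self, value):
        self.last_value *= value
        return self.last_value


def expensive_function(operator_obj, result, new_number):
    # Runs in the worker process on a pickled copy of operator_obj.
    time.sleep(0.2)  # stand-in for the expensive calculation
    number = operator_obj.operate(new_number)
    # Return the modified copy so the parent can take over its state.
    return result + number, operator_obj


class SumClass:
    def __init__(self, executor):
        self.executor = executor
        self.operator_obj = Operator()
        self.result = 0.0
        self.future = None

    def execute_function(self, new_number):
        self.future = self.executor.submit(
            expensive_function, self.operator_obj, self.result, new_number
        )

    def is_executing(self):
        if self.future is None:
            return False
        if not self.future.done():
            return True
        # Finished: copy the result and the updated operator back.
        self.result, self.operator_obj = self.future.result()
        self.future = None
        return False


def main():
    with ProcessPoolExecutor(max_workers=1) as executor:
        sum_obj = SumClass(executor)
        for new_number in (2.0, 3.0):
            sum_obj.execute_function(new_number)
            while sum_obj.is_executing():
                time.sleep(0.05)  # the light work would go here
        return sum_obj.result, sum_obj.operator_obj.last_value


if __name__ == "__main__":
    print(main())  # (8.0, 6.0)
```

Future.done() plays the role of is_alive() in the question, and future.result() also re-raises any exception from the worker, which a bare Process would not do. For deeply nested objects, returning only the changed fields instead of the whole object may be cheaper.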