I have a piece of code like the one below:
    pool = multiprocessing.Pool(10)
    for i in range(300):
        for m in range(500):
            data = do_some_calculation(resource)
            pool.apply_async(paralized_func, (data,), callback=update_resource)
        # need to wait for all processes to finish
        # {...}
        # Summarize resource
        do_something_with_resource(resource)
So basically I have two loops. I initialize the process pool once, outside the loops, to avoid the overhead of recreating it. At the end of the 2nd loop, I want to summarize the results of all the processes.

The problem is that I can't use pool.map() to wait, because the data inputs vary. I can't use pool.close() and pool.join() either, because I still need the pool in the next iteration of the 1st loop.

What is a good way to wait for the processes to finish in this case?
I tried checking pool._cache at the end of the 2nd loop:

    while len(pool._cache) > 0:
        sleep(0.001)

This works, but it looks weird and relies on a private attribute. Is there a better way to do this?
apply_async returns an AsyncResult object, which has a wait([timeout]) method you can use.

Example:
    pool = multiprocessing.Pool(10)
    for i in range(300):
        results = []
        for m in range(500):
            data = do_some_calculation(resource)
            result = pool.apply_async(paralized_func, (data,), callback=update_resource)
            results.append(result)
        # wait for all submitted tasks to finish
        [result.wait() for result in results]
        # Summarize resource
        do_something_with_resource(resource)
I haven't tested this code, since it isn't executable as-is, but it should work.
Alternatively, you can use a callback to count how many results have come back.
    pool = multiprocessing.Pool(10)
    for i in range(300):
        finished = []  # a lambda can't contain an assignment, so append to a list instead
        for m in range(500):
            data = do_some_calculation(resource)
            pool.apply_async(paralized_func, (data,), callback=finished.append)
        # wait until all 500 callbacks have fired
        while len(finished) < 500:
            time.sleep(0.001)
        # Summarize resource
        do_something_with_resource(resource)
There's an issue with the most upvoted answer:

    [result.wait() for result in results]

will not act as a roadblock if some of the workers raised an exception, because wait() treats a task that raised as finished and returns anyway. Here's a possible check that all workers finished processing successfully:
    while True:
        time.sleep(1)
        # successful() raises if a result is not ready yet
        try:
            ready = [result.ready() for result in results]
            successful = [result.successful() for result in results]
        except Exception:
            continue
        # exit the loop if all tasks returned success
        if all(successful):
            break
        # raise, reporting the exceptions received from the workers
        if all(ready) and not all(successful):
            raise Exception(
                f'Workers raised the following exceptions: '
                f'{[result._value for result in results if not result.successful()]}'
            )