I have the following code:

import multiprocessing as mp
import os

def funct(name):
    if nameisvalid:
        # do some stuff and save a file
        return 1
    else:
        return 0

num_proc = 20  # or a call to slurm/mp for the number of processors
pool = mp.Pool(processes=num_proc)
results = pool.map_async(funct, [n for n in nameindex])
pool.close()
pool.join()
I have run this on my desktop with a 6-core processor with num_proc = mp.cpu_count(), and it works fine and fast. But when I run the same script in an sbatch script on our processing cluster with -N 1 -n 20 (our nodes each have 24 processors), or with any other number of processors, it runs incredibly slowly and only appears to use between 10 and 15 of them. Is there some way to optimize multiprocessing for working with SLURM?
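One detail worth checking first is how the pool is sized on the cluster: mp.cpu_count() reports the node's physical CPUs, not the CPUs SLURM actually allocated to the job. A minimal sketch of sizing the pool from SLURM's environment instead (SLURM_NTASKS is set by sbatch from the -n option; falling back to cpu_count() keeps the script usable on a desktop; exact availability of the variable depends on how the job was launched):

```python
import multiprocessing as mp
import os

def pool_size():
    """Number of workers: the SLURM allocation if present, else all local CPUs."""
    return int(os.environ.get("SLURM_NTASKS", mp.cpu_count()))

if __name__ == "__main__":
    # Use the detected size when building the pool.
    with mp.Pool(processes=pool_size()) as pool:
        results = pool.map(abs, [-1, -2, 3])
```

This avoids oversubscribing a partial allocation (e.g. asking for 24 workers when SLURM only granted 20 cores).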
It turned out that funct checked the disk for a specific file, then loaded a file, then did work, then saved a file. This left my individual processes waiting on input/output operations instead of working. So I loaded all of the initial data before passing it to the pool, and added a Process from multiprocessing dedicated to saving files from a Queue that the pooled processes put their output into, so there is only ever one process trying to save.