
Python multiprocessing running faster locally than on cluster (slurm)

I have the following code

import multiprocessing as mp
import os

def funct(name):
    # Placeholder worker: validate the name, do the work, save a file.
    if name_is_valid(name):
        do_some_stuff_and_save_a_file(name)
        return 1
    else:
        return 0

if __name__ == '__main__':
    num_proc = 20  # or a call to slurm/mp for the number of processors
    pool = mp.Pool(processes=num_proc)
    results = pool.map_async(funct, list(nameindex))
    pool.close()
    pool.join()

I have run this on my desktop with a 6-core processor, with num_proc=mp.cpu_count(), and it works fine and fast. But when I run the script from an sbatch script on our processing cluster with -N 1 -n 20 (our nodes each have 24 processors), or with any other number of processors, it runs incredibly slowly and only appears to utilize 10 to 15 processors. Is there some way to optimize multiprocessing for working with slurm?
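For the "call to slurm" mentioned in the num_proc comment, one option is to read the CPU allocation from slurm's environment variables instead of hard-coding 20. This is only a sketch, assuming the job exports the usual SLURM_CPUS_PER_TASK or SLURM_NTASKS variables; allocated_cpus is a hypothetical helper name:

import multiprocessing as mp
import os

def allocated_cpus():
    # Prefer the slurm allocation when running under sbatch/srun.
    # SLURM_CPUS_PER_TASK is only set when --cpus-per-task is given,
    # so also check SLURM_NTASKS (set for -n) before falling back
    # to the local machine's CPU count.
    for var in ('SLURM_CPUS_PER_TASK', 'SLURM_NTASKS'):
        value = os.environ.get(var)
        if value:
            return int(value)
    return mp.cpu_count()

num_proc = allocated_cpus()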

funct checked the disk for a specific file, then loaded a file, then did the work, then saved a file. This left my individual processes waiting on input/output operations instead of working. So I loaded all of the initial data before passing it to the pool, and added a dedicated multiprocessing Process that saves files from a Queue the pooled processes put their output into, so only one process ever tries to write to disk.
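A minimal sketch of that arrangement, reusing num_proc and nameindex from the question. load_data, compute, and save_to_disk are hypothetical placeholders for my actual loading, processing, and saving code, and a Manager queue is one assumed way to share a queue with Pool workers:

import multiprocessing as mp

def worker(args):
    name, data, queue = args
    result = compute(data)          # compute() stands in for the real work
    queue.put((name, result))       # hand the result to the single writer
    return 1

def writer(queue):
    # The only process that writes output files to disk.
    while True:
        item = queue.get()
        if item is None:            # sentinel: all workers are done
            break
        name, result = item
        save_to_disk(name, result)  # save_to_disk() is a placeholder

if __name__ == '__main__':
    # Read all input data up front so the pooled workers never touch the disk.
    inputs = [(n, load_data(n)) for n in nameindex]   # load_data() is a placeholder

    manager = mp.Manager()
    queue = manager.Queue()         # a Manager queue can be passed to Pool workers
    saver = mp.Process(target=writer, args=(queue,))
    saver.start()

    with mp.Pool(processes=num_proc) as pool:
        pool.map(worker, [(name, data, queue) for name, data in inputs])

    queue.put(None)                 # tell the writer to finish
    saver.join()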


 