I am trying to write a program that applies a certain function hundreds of times. To speed things up, instead of running it sequentially, I am trying to parallelize execution of the function. What I am doing is the following:
import multiprocessing

def start_process():
    logger.debug('Starting {0}'.format(multiprocessing.current_process().name))

pool_size = len(inputs) if len(inputs) < multiprocessing.cpu_count() - 1 else multiprocessing.cpu_count() - 1

with multiprocessing.Pool(processes=pool_size, initializer=start_process) as pool:
    for o, ipt in pool.imap_unordered(train_func, inputs):
        output[(ipt[0], ipt[2])] = o
In the code above, I am using the initializer argument so that I can keep track of how many processes are spawned. The function train_func runs an optimization.
I am running the code on a server that has at most 32 processors available at any time. Although I would expect the number of spawned processes to peak at 31, I can see that 200-300 processes are spawned, and the program eventually crashes.
Moreover, I get the following error:
ERROR; return code from pthread_create() is 11
ERROR; return code from pthread_create() is 11
Error detail: Resource temporarily unavailable
Error detail: Resource temporarily unavailable
OMP: Error #34: System unable to allocate necessary resources for OMP thread:
OMP: System error #11: Resource temporarily unavailable
OMP: Hint Try decreasing the value of OMP_NUM_THREADS.
/bin/sh: fork: retry: No child processes
ERROR; return code from pthread_create() is 11
Error detail: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes
OMP: Error #34: System unable to allocate necessary resources for OMP thread:
OMP: System error #11: Resource temporarily unavailable
OMP: Hint Try decreasing the value of OMP_NUM_THREADS.
Could you please provide any hint as to how I can limit the number of processes spawned?
You should probably use the min builtin function. Also, the code that creates the pool must be wrapped in an if __name__ == '__main__': guard, as described in the multiprocessing docs; otherwise each spawned child re-executes the pool-creation code and spawns children of its own.
So, all in all, something like:
import logging
import multiprocessing

logger = logging.getLogger(__name__)

def start_process():
    logger.debug(
        "Starting {0}".format(multiprocessing.current_process().name)
    )

def train_func(*args):
    pass  # your optimization goes here

def main():
    inputs = []   # your list of inputs from the original code
    output = {}

    # Never ask for more workers than there are inputs,
    # and always leave one CPU free.
    pool_size = min(len(inputs), multiprocessing.cpu_count() - 1)
    with multiprocessing.Pool(
        processes=pool_size, initializer=start_process
    ) as pool:
        for o, ipt in pool.imap_unordered(train_func, inputs):
            output[(ipt[0], ipt[2])] = o

if __name__ == "__main__":
    main()
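Note also that the OMP errors in your traceback suggest that each worker process additionally spawns OpenMP threads inside train_func, which multiplies the total thread count well beyond the pool size. A minimal sketch of capping them in the pool initializer (following the "Try decreasing the value of OMP_NUM_THREADS" hint in your own error output; the exact environment variables that matter depend on which numerical library train_func uses, so this is an assumption):

```python
import multiprocessing
import os

def start_process():
    # Assumption: train_func uses an OpenMP-backed library (e.g. NumPy/MKL).
    # Setting these before the library initializes its thread pool limits
    # each worker to a single computation thread, so the total is roughly
    # pool_size rather than pool_size * threads_per_worker.
    os.environ["OMP_NUM_THREADS"] = "1"
    os.environ["MKL_NUM_THREADS"] = "1"

def square(x):
    # Stand-in for train_func, just to make the sketch runnable.
    return x * x

if __name__ == "__main__":
    with multiprocessing.Pool(processes=2, initializer=start_process) as pool:
        results = sorted(pool.imap_unordered(square, range(5)))
    print(results)  # [0, 1, 4, 9, 16]
```

The initializer runs once in each worker before any task, which is early enough for most libraries that read these variables at first use.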