
Multiprocessing Code not producing results

I am running the code below in Python. It is an example of how to use multiprocessing. However, the result (line 16) is never printed; only the dataset (line 9) is printed, and the kernel keeps running. I expected the result to appear almost instantly, since the code spreads the work across several CPU cores through multiprocessing. Can someone tell me what the issue is?

from multiprocessing import Pool

def square(x):
    # calculate the square of the value of x
    return x*x

if __name__ == '__main__':

    # Define the dataset
    dataset = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

    # Output the dataset
    print ('Dataset: ' + str(dataset)) # Line 9

    # Run this with a pool of 5 agents having a chunksize of 3 until finished
    agents = 5
    chunksize = 3
    with Pool(processes=agents) as pool:
        result = pool.map(square, dataset, chunksize)

    # Output the result
    print ('Result:  ' + str(result))  # Line 16

There is a known bug in older versions of Python on Windows 10: https://bugs.python.org/issue35797

It occurs when multiprocessing is used inside a venv. The fix was released in Python 3.7.3.
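To find out whether this bug could apply to a given setup, it may help to check the interpreter version and whether a virtual environment is active. The following is a minimal sketch; the helper names `has_venv_pool_fix` and `in_virtualenv` are made up for illustration, and the 3.7.3 threshold is taken from the statement above.

```python
import sys


def has_venv_pool_fix(version_info=sys.version_info):
    """Return True if the version is at least 3.7.3 (threshold from the answer above)."""
    return tuple(version_info[:3]) >= (3, 7, 3)


def in_virtualenv():
    """Return True if the interpreter appears to run inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Python version:", sys.version.split()[0])
print("Version includes the fix:", has_venv_pool_fix())
print("Running inside a venv:", in_virtualenv())
```

If the first check reports False and the second reports True, upgrading Python is probably the first thing to try.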

You are running under Windows. Look at the console output where you started up Jupyter. You will see 5 instances (you have 5 processes in your pool) of:

AttributeError: Can't get attribute 'square' on <module '__main__' (built-in)>
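One way to see why this lookup fails: pickle, which multiprocessing uses to send the function to the workers, stores a function as a module name plus a function name rather than as code. A minimal sketch, using `math.sqrt` as a stand-in (it is not part of the original post):

```python
import math
import pickle

# Pickle a module-level function. The payload holds only a reference
# (module name and function name), not the function's code.
payload = pickle.dumps(math.sqrt)

print(b"math" in payload)   # the module name is stored in the payload
print(b"sqrt" in payload)   # the function name is stored in the payload

# Unpickling re-imports the module and looks the name up again.
restored = pickle.loads(payload)
print(restored is math.sqrt)
```

On Windows each worker is a freshly started interpreter, so it has to resolve that reference itself. A function typed into a notebook cell exists only in the parent's `__main__`, so the worker's lookup of `square` fails.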

You need to put your worker function in a file, for instance workers.py, in the same directory as your Jupyter code:

workers.py

def square(x):
    # calculate the square of the value of x
    return x*x

Then remove the above function from your cell and instead add:

import workers

and then:

with Pool(processes=agents) as pool:
    result = pool.map(workers.square, dataset, chunksize)
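Put together, the steps above might look like the sketch below. Creating workers.py from Python is only a convenience for this example; writing the file by hand works just as well. The helper names `write_workers_module` and `run_pool` are hypothetical and do not come from the original post.

```python
import importlib
import sys
from multiprocessing import Pool
from pathlib import Path

# Source of the worker module, matching the square() function above.
WORKER_SOURCE = (
    "def square(x):\n"
    "    # calculate the square of the value of x\n"
    "    return x * x\n"
)


def write_workers_module(directory="."):
    """Write workers.py into `directory` and make that directory importable."""
    path = Path(directory) / "workers.py"
    path.write_text(WORKER_SOURCE)
    folder = str(path.parent.resolve())
    if folder not in sys.path:
        sys.path.insert(0, folder)
    return path


def run_pool(dataset, agents=5, chunksize=3):
    """Square every item of `dataset` using a pool of `agents` processes."""
    # Import by module name so that worker processes can import it as well.
    workers = importlib.import_module("workers")
    with Pool(processes=agents) as pool:
        return pool.map(workers.square, dataset, chunksize)
```

From a notebook cell, or under an `if __name__ == '__main__':` guard in a script, calling `write_workers_module()` and then `run_pool(dataset)` with the dataset from the question should return the squares from 1 to 196.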

Note

See my comment(s) to your post concerning the chunksize argument to the map method. When chunksize is None, map computes a default more or less as follows, based on the iterable argument:

if not hasattr(iterable, '__len__'):
    iterable = list(iterable)

if chunksize is None:
    chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
    if extra:
        chunksize += 1

So I misspoke in one detail: regardless of whether you specify a chunksize, map will convert your iterable to a list if it needs to in order to get its length. But essentially the default chunksize is the ceiling of the number of jobs being submitted divided by 4 times the pool size.
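As a quick check of that rule, here is a small standalone sketch of the same arithmetic. The function name `default_chunksize` is hypothetical; it only mirrors the divmod logic quoted above and is not part of the multiprocessing API.

```python
import math


def default_chunksize(n_items, pool_size):
    """Mirror the divmod logic quoted above for a chunksize of None."""
    chunksize, extra = divmod(n_items, pool_size * 4)
    if extra:
        chunksize += 1
    return chunksize


# The dataset in the question has 14 items and the pool has 5 processes.
print(default_chunksize(14, 5))
# The same value expressed as a ceiling.
print(math.ceil(14 / (5 * 4)))
```

For the question's 14 items and 5 processes, both expressions give 1, so passing chunksize=3 explicitly makes the batches larger than the default would.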

