
Python multiprocessing Pool returns same output in for loop

I am using multiprocessing.Pool in Python 3.6, through Spyder 3.6.5 on a Windows 10 machine. The aim is to compute the outputs of a simple square function for multiple inputs (this example uses only 4 values to keep it practical). The code below works fine:

import numpy as np
import multiprocessing

from multiprocessing import Pool

data=[]
data.append(np.array([1,2]))
data.append(np.array([4,5]))

Output=np.zeros((2,2))


for i in range (0,2):

    data1=data[i]

    def square(x):
        return x*x

    if __name__ == '__main__':
        __spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)"
        with Pool(multiprocessing.cpu_count()) as p:
            output = p.map(square, data1, chunksize=10)
            p.close()
            output=np.asarray(output)
            Output[i]=output

However, when I change square so that it looks up its input value (x) in data1 by index:

def square(ii):
    x=data1[ii]
    return x*x

the for loop runs twice (because of 'for i in range (0,2)'), but the results of p.map are the same in both runs and equal to those of the second run, i.e. instead of Output=np.array([[1,4],[16,25]]) I am getting Output=np.array([[16,25],[16,25]]). It seems as if the loop runs twice with i=1, rather than with i=0 in the first iteration and i=1 in the second.

Any ideas of what I am doing wrong?

Closures in Python don't copy the values of the variables they close over; they only keep a reference to the enclosing scope. In your second attempt, square reads data1 as if it held a different value in each iteration, but every version of square refers to the same underlying variable. By the time the multiprocessing module has started a new process and calls square there, that variable has already changed: on Windows, worker processes are started by re-importing the main script, so in each worker the loop has already run to its last iteration and data1 is data[1].

Try this for instance:

res = []
for i in range(5):
    def square():
        return i * i
    res.append(square)


print([f() for f in res])
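
For comparison, here is a minimal sketch that runs the example above next to a variant that binds the value at definition time. The default-argument trick (i=i) is a different technique from the helper function described below, and the list names late and early are only illustrative.

```python
# Late binding: every function reads i when it is called, after the loop has ended.
late = []
for i in range(5):
    def square():
        return i * i
    late.append(square)

# Early binding: a default argument is evaluated once, when the def statement runs.
early = []
for i in range(5):
    def square(i=i):
        return i * i
    early.append(square)

print([f() for f in late])   # [16, 16, 16, 16, 16]
print([f() for f in early])  # [0, 1, 4, 9, 16]
```

All five functions in the first list see the final value of i, which mirrors what happens to data1 in the question.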

As a solution, you could manually create a new scope holding the appropriate value in each iteration.

A function call creates a new scope, so you could define a helper function like this outside the loop:

def square_creater(data1):
    def square(ii):
        x = data1[ii]
        return x * x
    return square

and then at each iteration use

square = square_creater(data1)
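
Put together, the helper and the loop might look like the sketch below. As assumptions made to keep it self-contained, it uses the built-in map in place of Pool.map and plain lists in place of the NumPy arrays; the list name funcs is illustrative.

```python
def square_creater(data1):
    # Each call creates a new scope, so the returned function keeps its own data1.
    def square(ii):
        x = data1[ii]
        return x * x
    return square


data = [[1, 2], [4, 5]]
funcs = [square_creater(data1) for data1 in data]

# Calling the functions after the loop still uses the data1 each one was created with.
output = [list(map(square, range(2))) for square in funcs]
print(output)  # [[1, 4], [16, 25]]
```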

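This gives every square its own data1. Note, however, that Pool.map has to pickle the function it sends to the workers, and a function defined inside another function cannot be pickled by the standard pickle module; with a Pool, bind the value with functools.partial around a module-level function instead.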
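
A minimal sketch of a variant that can be passed to Pool.map, where the function must be picklable: the data is bound with functools.partial to a module-level function. The names square_at and compute and the processes=2 default are hypothetical, and plain lists stand in for the NumPy arrays.

```python
from functools import partial
from multiprocessing import Pool


def square_at(data1, ii):
    # Defined at module level, so worker processes can look it up by name.
    x = data1[ii]
    return x * x


def compute(data, processes=2):
    output = []
    with Pool(processes) as p:
        for data1 in data:
            # partial stores the current data1 inside the callable itself,
            # so it is pickled and sent along with each task.
            square = partial(square_at, data1)
            output.append(p.map(square, range(len(data1))))
    return output


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this part on Windows.
    print(compute([[1, 2], [4, 5]]))
```

Run as a script, this should print [[1, 4], [16, 25]], because each task carries its own copy of data1 rather than reading a module-level variable in the worker.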
