
Using local memory in Pool workers with Python's multiprocessing module

I'm working on implementing a randomized algorithm in Python. Since this involves doing the same thing many (say N) times, it parallelizes quite naturally and I would like to make use of that. More specifically, I want to distribute the N iterations over all of the cores of my CPU. The problem in question involves computing the maximum of something, so every worker could compute its own local maximum and report only that back to the parent process, which then only needs to pick the global maximum out of those few local maxima.

Somewhat surprisingly, this does not seem to be an intended use case of the multiprocessing module, but I'm not entirely sure how else to go about it. After some research I came up with the following solution (a toy problem that finds the maximum of a list, but one that is structurally the same as my actual problem):

import random
import multiprocessing

l = []
N = 100
numCores = multiprocessing.cpu_count()

# globals for every worker
mySendPipe = None
myRecPipe = None

def doWork():
    pipes = zip(*[multiprocessing.Pipe() for i in range(numCores)])
    pool = multiprocessing.Pool(numCores, initializeWorker, (pipes,))
    pool.map(findMax, range(N))
    # shut this pool down so the next doWork() call starts a fresh one
    pool.close()
    pool.join()

    results = []
    # collate results
    for p in pipes[0]:
        if p.poll():
            results.append(p.recv())
    print(results)

    return max(results)

def initializeWorker(pipes):
    global mySendPipe, myRecPipe
    # ID of a worker process; they are consistently named PoolWorker-i
    myID = int(multiprocessing.current_process().name.split("-")[1])-1
    # Modulo: When starting a second pool for the second iteration of doWork() they are named with IDs 5-8.
    mySendPipe = pipes[1][myID%numCores]
    myRecPipe = pipes[0][myID%numCores]

def findMax(count):
    # each worker carries its running maximum across calls through its own pipe:
    # read back the previous best (if any), draw a new sample, send the updated best
    myMax = 0
    if myRecPipe.poll():
        myMax = myRecPipe.recv()
    value = random.choice(l)
    if myMax < value:
        myMax = value
    mySendPipe.send(myMax)

l = range(1, 1001)
random.shuffle(l)
max1 = doWork()
l = range(1001, 2001)
random.shuffle(l)
max2 = doWork()
print(max1, max2)

This works, sort of, but I've got a problem with it. Namely, using pipes to store the intermediate results feels rather silly (and is probably slow). But it also has a real problem: I can't send arbitrarily large objects through a pipe, and my application unfortunately sometimes exceeds that size (and then deadlocks).

So, what I'd really like is a function analogous to the initializer that I can call once for every worker in the pool to have it return its local result to the parent process. I could not find such functionality, but maybe someone here has an idea?
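
A rough sketch of the closest approximation I can think of (all names here, like initWorker and oneIteration, are made up for illustration): each worker keeps a running best in a global that the initializer sets up and returns it from every task, so the parent only has to reduce the returned values with max(). But this still sends something back for every single call, rather than once per worker, which is exactly what I'd like to avoid:

import random
import multiprocessing

localBest = None   # per-worker state, one copy in every pool process
inputData = None

def initWorker(data):
    global inputData, localBest
    inputData = data             # read-only input, handed over once per worker
    localBest = float("-inf")    # this worker's best value so far

def oneIteration(_):
    global localBest
    value = random.choice(inputData)
    if value > localBest:
        localBest = value
    return localBest             # only the running best crosses the process boundary

if __name__ == "__main__":
    N = 100
    data = list(range(1, 1001))
    pool = multiprocessing.Pool(multiprocessing.cpu_count(), initWorker, (data,))
    print(max(pool.map(oneIteration, range(N))))   # global maximum
    pool.close()
    pool.join()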

A few final notes:

  • I use a global variable for the input because in my application the input is very large and I don't want to copy it to every process. Since the processes never write to it, I believe it should never be copied (or am I wrong there? a small sketch of how I picture this follows after these notes). I'm open to suggestions for doing this differently, but note that I need to run this on changing inputs (sequentially though, just as in the example above).
  • I'd like to avoid using the Manager class, since (as I understand it) it introduces synchronisation and locks, which should be completely unnecessary for this problem.
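
Regarding the first note, this is roughly how I picture the inheritance working, as a minimal sketch (it assumes the fork start method that Python 2.7 uses on Linux; bigInput and probe are made-up names, not from my real code):

import multiprocessing

bigInput = None   # set once in the parent before the pool is created

def probe(_):
    # the worker sees the parent's list without it being pickled per task;
    # the OS only duplicates pages lazily (copy-on-write) when they are written to
    return len(bigInput)

if __name__ == "__main__":
    bigInput = list(range(10**6))
    pool = multiprocessing.Pool(2)
    print(pool.map(probe, range(4)))   # [1000000, 1000000, 1000000, 1000000]
    pool.close()
    pool.join()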

The only other similar question I could find is Python's multiprocessing and memory, but there they want to actually process the individual results of the workers, whereas I don't want the workers to return N things; I want them to run a total of N times between them and return only their local best results.

I'm using Python 2.7.15.


tl;dr: Is there a way to use local memory for every worker process in a multiprocessing pool, so that every worker can compute a local optimum and the parent process only needs to worry about figuring out which one of those is best?

You might be overthinking this a little. By making your worker functions (in this case findMax) actually return a value instead of communicating it through a pipe, you can collect the results from the call to pool.map() - it is just a parallel variant of map, after all! It maps a function over a list of inputs and returns the list of results of those function calls.

The simplest example illustrating my point follows your "distributed max" example:

import multiprocessing

# [0,1,2,3,4,5,6,7,8]
x = range(9)

# split the list into 3 chunks
# [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
chunks = zip(*[iter(x)]*3)
pool = multiprocessing.Pool(2)
# compute the max of each chunk:
# max((0,1,2)) == 2
# max((3,4,5)) == 5
# ...
res = pool.map(max, chunks)
print(res)

This returns [2, 5, 8]. Be aware that there is some light magic going on: I use the built-in max() function, which expects an iterable as input. Now, if I were to pool.map over a plain list of integers, say range(9), that would result in calls to max(0), max(1), etc. - not very useful, huh? Instead, I partition the list into chunks, so effectively, when mapping, we now map over a list of tuples, thus feeding one tuple to max on each call.

So perhaps you have to:

  • return a value from your worker func
  • think about how you structure your input domain so that you feed meaningful chunks to each worker (a sketch combining both points follows below)
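
Transferred to your setting, that could look roughly like the following sketch (not your exact code: I pass the input via the initializer and split the N iterations into one chunk per core, so each task returns only its local best; initWorker and localBest are made-up names):

import random
import multiprocessing

def initWorker(data):
    # make the (read-only) input available to every worker once
    global inputData
    inputData = data

def localBest(nIterations):
    # run this task's share of the iterations, return only the best value seen
    return max(random.choice(inputData) for _ in range(nIterations))

if __name__ == "__main__":
    N = 100
    numCores = multiprocessing.cpu_count()
    data = list(range(1, 1001))
    random.shuffle(data)

    pool = multiprocessing.Pool(numCores, initWorker, (data,))
    # split the N iterations into numCores chunks, e.g. N=100 on 4 cores -> [25, 25, 25, 25]
    chunkSizes = [N // numCores + (i < N % numCores) for i in range(numCores)]
    chunkSizes = [c for c in chunkSizes if c > 0]
    print(max(pool.map(localBest, chunkSizes)))   # the global maximum
    pool.close()
    pool.join()

The parent now only reduces at most numCores values, and nothing large ever has to travel back from the workers.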

PS: You wrote a great first question! Thank you, it was a pleasure reading it :) Welcome to Stack Overflow!
