
Updating shared read-only data with Python multiprocessing

I am attempting to use Python's multiprocessing library to experiment with distributed neural networks. At the moment, I have it set up so that a server process creates the neural network and chunks the input for mini-batch gradient descent; the batches are put into a shared queue, processed by a client process, and the results put into a separate shared queue.

So far, everything is working except that in order to process the batches and produce a gradient, the child processes need a copy of the network weights, which I have shared using a multiprocessing Array. The client processes only need a read-only copy of the weights, but the server process updates the local copies after each training epoch.

My question is: how would I update the shared memory to reflect the changed weights, so that on the next epoch the client processes have the correct values for computing gradients?

I have been playing with multiprocessing since reading this and found that updating data in an mp.Array is not too difficult. The bit that got me was the fact that access is not atomic when using loops to iterate over the Array. The following snippet sets up a simple master-worker pair using mp.Process (using a Pool would have been nicer, but this was faster for me), where an mp.Array is used to synchronise data that the master changes frequently (as fast as it can):

from multiprocessing import Process, RLock, Array
from time import sleep

def worker(n, array, arrayLock):
    while True:
        # Hold the lock for the whole read so the snapshot is consistent
        arrayLock.acquire()
        print("Worker: %i -> %s" % (n, ",".join(str(i) for i in array)))
        arrayLock.release()
        sleep(n + 1)

if __name__ == '__main__':
    # Back the Array with our own RLock instead of the auto-created one
    arrayLock = RLock()
    array = Array('i', range(10), lock=arrayLock)

    pd = {}
    for i in range(3):
        pd[i] = Process(target=worker, args=(i, array, arrayLock))
        pd[i].start()

    try:
        # Master: flip the sign of every element, holding the lock
        # across the whole pass
        while True:
            arrayLock.acquire()
            for i in range(len(array)):
                array[i] = -array[i]
            arrayLock.release()
    except KeyboardInterrupt:
        pass

    for p in pd.values():
        p.terminate()

Resulting in the following output:

~> python mp_shared.py
Worker: 0 -> 0,1,2,3,4,5,6,7,8,9
Worker: 1 -> 0,-1,-2,-3,-4,-5,-6,-7,-8,-9
Worker: 2 -> 0,1,2,3,4,5,6,7,8,9
Worker: 0 -> 0,-1,-2,-3,-4,-5,-6,-7,-8,-9
Worker: 1 -> 0,-1,-2,-3,-4,-5,-6,-7,-8,-9
Worker: 0 -> 0,1,2,3,4,5,6,7,8,9

Updating the data across processes was simply a matter of changing the values in the Array. I hit an issue, though, where the result would look like this (note the alternating signs of the data):

Worker: 0 -> 0,-1,2,-3,4,-5,6,-7,8,-9
Worker: 1 -> 0,-1,2,-3,4,-5,6,-7,8,-9
Worker: 2 -> 0,-1,2,-3,4,-5,6,-7,8,-9

which was caused by the fact that the Lock automatically created for the Array does not synchronise access for the whole loop when reading from or writing to the array: it is acquired and released on each element access. The master process would be zipping in and out of the Array, making changes between lock acquisitions by the workers.

To avoid this, I just created my own RLock for use with the Array (it needs to be an RLock, since touching the Array acquires the lock, which would block if you had already acquired a plain Lock). I passed the RLock to all the workers so that everyone could have atomic operations (in your situation, I am sure it is important that reads and writes are atomic to prevent errors in the gradient calculation).
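Note that if you let Array create its lock for you (lock=True, the default), the lock it creates is already an RLock, and you can retrieve it with get_lock() rather than constructing and passing your own. A minimal sketch of an atomic whole-array update using that built-in lock:

```python
from multiprocessing import Array

# lock=True (the default) wraps the array in a synchronized wrapper
# whose underlying RLock is exposed via get_lock()
array = Array('i', range(10))

# Hold the lock across the whole pass so no worker can observe a
# half-updated array (the per-element locking alone cannot guarantee this)
with array.get_lock():
    for i in range(len(array)):
        array[i] = -array[i]
```

The with-statement form also guarantees the lock is released even if an exception is raised mid-update, which the acquire()/release() pairs above do not.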

Edit:

Another alternative would appear to be mmap, but I can't comment on its use here, or whether changes propagate as desired.
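For completeness, a rough sketch of what the mmap route might look like: an anonymous mapping created before the workers are started is shared with children created by fork, so writes by the master are visible to the workers. This assumes the 'fork' start method (the default on Linux) and does no locking, so you would still need an RLock as above; the helper names are illustrative, not from any library.

```python
import mmap
import struct
from multiprocessing import Process

N = 10
ITEM = struct.calcsize('i')

# Anonymous mapping; with the 'fork' start method, children inherit it
# and see the same underlying memory
buf = mmap.mmap(-1, N * ITEM)

def write_weights(values):
    # Pack N ints into the start of the mapping (no locking here)
    buf.seek(0)
    buf.write(struct.pack('%di' % len(values), *values))

def read_weights():
    buf.seek(0)
    return list(struct.unpack('%di' % N, buf.read(N * ITEM)))

def child():
    # Reads the values the parent wrote before forking
    print(read_weights())

if __name__ == '__main__':
    write_weights(range(N))
    p = Process(target=child)
    p.start()
    p.join()
```

With raw bytes like this you also have to manage the layout (typecodes, offsets) yourself, which mp.Array does for you.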
