Updating shared read-only data with Python multiprocessing
I am attempting to use Python's multiprocessing library to experiment with distributed neural networks. At the moment, I have it set up so that a server process creates the neural network and chunks the input for mini-batch gradient descent. The batches are put into a shared queue, processed by a client process, and the results are put into a separate shared queue.
So far, everything is working, except that in order to process the batches and produce a gradient, the child processes need a copy of the network weights, which I have shared using a multiprocessing Array. The client processes only need a read-only copy of the weights, but the server process updates the local copies after each training epoch.
My question is: how would I update the shared memory to reflect the changed weights so that, on the next epoch, the client processes have the correct values for computing gradients?
I have been playing with multiprocessing since reading this and found that updating data in an mp.Array is not too difficult. The bit that got me was the fact that access was not atomic when using loops to iterate over the Array. The following snippet sets up a simple master-worker arrangement using mp.Process (using Pool would have been nicer, but this was faster for me), where an mp.Array is used to synchronise data which the master changes frequently (as fast as it can):
from multiprocessing import Process, RLock, Array
from time import sleep

def worker(n, array, arrayLock):
    while True:
        # Hold the lock for the whole read so the master cannot change
        # the array while it is being joined into a string.
        arrayLock.acquire()
        print("Worker: %i -> %s" % (n, ",".join(str(i) for i in array)))
        arrayLock.release()
        sleep(n + 1)

if __name__ == '__main__':
    # The same RLock guards the Array itself and every explicit acquire.
    arrayLock = RLock()
    array = Array('i', range(10), lock=arrayLock)
    pd = {}
    for i in range(3):
        pd[i] = Process(target=worker, args=(i, array, arrayLock))
        pd[i].start()
    try:
        while True:
            # Negate every element as a single atomic update.
            arrayLock.acquire()
            for i in range(len(array)):
                array[i] = -array[i]
            arrayLock.release()
    except KeyboardInterrupt:
        pass
    # Stop the workers once the master has been interrupted.
    for p in pd.values():
        p.terminate()
Resulting in the following output:
~> python mp_shared.py
Worker: 0 -> 0,1,2,3,4,5,6,7,8,9
Worker: 1 -> 0,-1,-2,-3,-4,-5,-6,-7,-8,-9
Worker: 2 -> 0,1,2,3,4,5,6,7,8,9
Worker: 0 -> 0,-1,-2,-3,-4,-5,-6,-7,-8,-9
Worker: 1 -> 0,-1,-2,-3,-4,-5,-6,-7,-8,-9
Worker: 0 -> 0,1,2,3,4,5,6,7,8,9
Updating the data across processes was simply a matter of changing the values in the Array. I hit an issue, though, where the result would look like this (note the alternating signs of the data):
Worker: 0 -> 0,-1,2,-3,4,-5,6,-7,8,-9
Worker: 1 -> 0,-1,2,-3,4,-5,6,-7,8,-9
Worker: 2 -> 0,-1,2,-3,4,-5,6,-7,8,-9
which was caused by the fact that the Lock automatically created for the Array does not synchronise access for the whole loop when reading from or writing to the array! It only guards each individual element access, so the master process would be zipping in and out of the Array, making changes between lock acquisitions by the workers.
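For illustration, here is a minimal sketch of holding one lock across a whole loop, assuming the Array is created with its default lock (which in CPython is recursive, so the per-element accesses inside the loop can re-acquire it). It uses get_lock() to fetch the Array's own lock, which is an alternative to the separate lock used in this answer. The helper names negate_all and snapshot are made up for this example.

```python
from multiprocessing import Array

def negate_all(shared):
    """Flip the sign of every element as one locked step (hypothetical helper)."""
    # get_lock() returns the lock the Array was created with. Holding it
    # around the loop keeps other processes out until every element is done.
    with shared.get_lock():
        for i in range(len(shared)):
            shared[i] = -shared[i]

def snapshot(shared):
    """Return a consistent copy of the whole array as a plain list."""
    with shared.get_lock():
        return shared[:]

if __name__ == "__main__":
    data = Array('i', range(10))
    negate_all(data)
    print(snapshot(data))
```

Any reader that takes its copy through snapshot should see either all-positive or all-negative values, never the alternating mix shown above, provided every writer goes through the same lock.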
To avoid this, I just created my own RLock for use with the Array. It needs to be an RLock because touching the Array makes it acquire the lock, which would block if you had already acquired a plain Lock. I passed the RLock to all the workers so that everyone could have atomic operations (in your situation, I am sure it is important that reads and writes are atomic to prevent errors in gradient calculation).
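Applied to the weight-sharing setup in the question, the idea might look roughly like the sketch below. It is not the asker's actual code: publish_weights, read_weights and client are hypothetical names, the weights are assumed to be a flat vector of doubles whose length never changes, and the sum stands in for a real gradient computation.

```python
from multiprocessing import Array, Process, Queue

def publish_weights(shared, new_weights):
    """Server side: overwrite all shared weights in one locked step."""
    if len(new_weights) != len(shared):
        raise ValueError("weight vector length changed")
    with shared.get_lock():
        # Slice assignment copies every value while the lock is held.
        shared[:] = new_weights

def read_weights(shared):
    """Client side: take a private, consistent copy of the weights."""
    with shared.get_lock():
        return shared[:]

def client(shared, results):
    weights = read_weights(shared)
    results.put(sum(weights))  # stand-in for computing a gradient

if __name__ == "__main__":
    weights = Array('d', [0.0] * 4)
    results = Queue()
    for epoch in range(3):
        # Pretend training produced new weights for this epoch.
        publish_weights(weights, [float(epoch)] * 4)
        p = Process(target=client, args=(weights, results))
        p.start()
        print("epoch", epoch, "client saw sum", results.get())
        p.join()
```

Because each client copies the weights out under the lock and then works on its private list, the server can publish the next epoch's values without a client ever seeing a half-updated vector.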
Edit:
Another alternative would appear to be mmap, but I can't comment on its use or on whether changes work as desired here.