
Python multiprocessing with shared numpy array

Suppose I create an object A with a 2-dimensional numpy array as an attribute. Then I create 10 processes using the Process API to randomly set the rows of A.

With the following code, is self.x shared among all the processes, or does every process just get its own copy?

If it is not shared, I will lose all the updates, right?

import numpy as np
from multiprocessing import Process


class A:

    def __init__(self):
        self.x = np.zeros((3, 4))

    def update(self):
        threads = []
        for i in range(10):
            # Each Process runs self.set in a separate process, not a thread.
            trd = Process(target=self.set, args=(i,))
            threads.append(trd)
            trd.start()

        for trd in threads:
            trd.join()

    def set(self, i):
        # i % 3 keeps the row index within the 3 rows of self.x
        self.x[i % 3] = np.random.rand(4)


if __name__ == "__main__":
    a = A()
    a.update()

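No, it is not shared. Each process you start works on its own copy of the parent's data: under the fork start method the child gets a copy-on-write copy of the parent's memory, and under spawn it receives a pickled copy of the object. No object is shared, so updates made in a child are never visible to the parent.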
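A minimal sketch of this behaviour, assuming nothing beyond numpy and the standard library. The helper names `fill_row` and `parent_view_after_child_write` are made up for illustration and are not part of the question's code.

```python
import numpy as np
from multiprocessing import Process


def fill_row(x, row):
    # Runs in the child process and modifies only the child's copy of x.
    x[row] = 1.0


def parent_view_after_child_write():
    x = np.zeros((3, 4))
    p = Process(target=fill_row, args=(x, 0))
    p.start()
    p.join()
    # The parent's array is returned exactly as it was created.
    return x


if __name__ == "__main__":
    print(parent_view_after_child_write().sum())  # prints 0.0
```

The child's write to row 0 is lost when the child exits, which is the "lost updates" effect the question asks about.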

To create a shared variable you have to use the shared ctypes objects that multiprocessing provides (Value and Array).

So instead of declaring the array as:

self.x = np.zeros((3,4))

you can declare it using a multiprocessing Array:

from multiprocessing import Array
self.x = Array('i', [0]*10)
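As a rough illustration of how such an Array behaves, the sketch below has each child write one slot and then reads the result back in the parent. The function names `write_slot` and `squares` are hypothetical and chosen only for this example.

```python
from multiprocessing import Array, Process


def write_slot(arr, i):
    # arr lives in shared memory, so this write is visible to the parent.
    arr[i] = i * i


def squares(n=10):
    arr = Array('i', [0] * n)  # n C ints, shared between processes
    procs = [Process(target=write_slot, args=(arr, i)) for i in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return arr[:]  # copy the shared values into a plain list


if __name__ == "__main__":
    print(squares(4))  # prints [0, 1, 4, 9]
```

Unlike a plain numpy array, the values written by the children are still there after `join()`.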

If you still want to make the numpy array itself a shared array, have a look at this great answer.

The caveat is that it might not be that easy. You will also have to lock the shared array to avoid race conditions.
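One common way to combine the two ideas, sketched here under assumptions rather than taken from the linked answer, is to view the shared buffer through `np.frombuffer` and guard writes with the lock that `Array` creates by default. The names `as_ndarray`, `set_row` and `run`, and the 3x4 float64 layout, are illustrative choices.

```python
import numpy as np
from multiprocessing import Array, Process

ROWS, COLS = 3, 4


def as_ndarray(shared):
    # View the shared buffer as a (ROWS, COLS) float64 array without copying.
    return np.frombuffer(shared.get_obj(), dtype=np.float64).reshape(ROWS, COLS)


def set_row(shared, row, value):
    # get_lock() returns the lock that Array creates when lock=True (the default).
    with shared.get_lock():
        as_ndarray(shared)[row] = value


def run():
    shared = Array('d', ROWS * COLS)  # zero-initialised doubles in shared memory
    workers = [Process(target=set_row, args=(shared, r, float(r + 1)))
               for r in range(ROWS)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # Copy so the result no longer depends on the shared buffer.
    return as_ndarray(shared).copy()


if __name__ == "__main__":
    print(run())
```

In this sketch each worker writes a different row, so the lock is not strictly required. It matters as soon as several processes may touch the same elements, for example when rows are picked at random as in the question.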


 