How do I make processes able to write in an array of the main program?

I am making a process pool, and each process needs to write to a different part of a matrix that exists in the main program. There is no risk of overwriting data, because each process works on different rows of the matrix. How can I make the matrix writable from within the processes?

The program is a matrix multiplier that a professor assigned me, and it has to use multiple processes. It will create one process for every core the computer has. The main program will send different parts of the matrix to the processes, which will compute them and return the results in a way that lets me identify which row each response corresponds to.

Have you tried using the multiprocessing.Array class to set up some shared memory?

See also the example from the docs:

from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print(num.value)
    print(arr[:])

Just extend this to a matrix of size h*w with i*w+j-style indexing. Then add multiple processes using a process pool.
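
The i*w+j indexing mentioned above can be sketched as follows. This is only an illustration under some assumptions: the names fill_row and fill_matrix are hypothetical, the values written are placeholders rather than a real product, and one Process per row is used for brevity (a pool would need the shared array passed through an initializer, since a multiprocessing.Array cannot be sent as an ordinary task argument).

```python
from multiprocessing import Array, Process


def fill_row(flat, w, i):
    """Write row i of an h*w matrix stored row-major in the flat shared array."""
    for j in range(w):
        # Placeholder value; real code would compute a matrix product here.
        flat[i * w + j] = i * 10 + j


def fill_matrix(h, w):
    """Start one process per row (illustrative only) and return nested lists."""
    flat = Array('d', h * w)  # zero-initialised shared doubles
    procs = [Process(target=fill_row, args=(flat, w, i)) for i in range(h)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Element (i, j) lives at flat[i*w + j], so row i is one contiguous slice.
    return [flat[i * w:(i + 1) * w] for i in range(h)]
```

As in the docs example, fill_matrix should be called from under an `if __name__ == '__main__':` guard so that it also works on platforms that start child processes by spawning a fresh interpreter.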

The cost of creating new processes, or of copying matrices between them if processes are reused, overshadows the cost of the matrix multiplication itself. Anyway, numpy.dot() can utilize different CPU cores by itself (when NumPy is linked against a multithreaded BLAS).

Matrix multiplication can be distributed between processes by computing different rows of the result in different processes, e.g., given input matrices a and b, the (i,j) element of the result is:

out[i,j] = sum(a[i,:] * b[:,j])

So the i-th row can be computed as (here i is passed as a slice of rows):

import numpy as np

def dot_slice(a, b, out, i):
    """Compute rows `i` of a.dot(b) into `out`; `i` is a slice of row indexes."""
    t = np.empty_like(a[i,:])
    for j in range(b.shape[1]):
        # out[i,j] = sum(a[i,:] * b[:,j])
        np.multiply(a[i,:], b[:,j], t).sum(axis=1, out=out[i,j])

A numpy array accepts a slice as an index, e.g., a[1:3,:] returns the 2nd and 3rd rows.
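
A small check of the slicing described above, using made-up 4x3 and 3x2 inputs (the array contents are arbitrary). It shows that a row slice is a view of the original array, and that multiplying the slice by b yields exactly the matching rows of the full product.

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape(4, 3)
b = np.arange(6, dtype=np.float64).reshape(3, 2)

rows = a[1:3, :]            # 2nd and 3rd rows, shape (2, 3); a view, not a copy
out = np.zeros((4, 2))
out[1:3, :] = rows.dot(b)   # fills only rows 1 and 2 of the result
```

This is why each worker only needs to know which slice of rows it owns: the slice selects both the input rows and the output rows to write.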

a and b are read-only, so child processes can inherit them as is (exploiting copy-on-write on Linux); the result is computed using a shared array. Only indexes are copied during computations:

import ctypes
import multiprocessing as mp

def dot(a, b, nprocesses=mp.cpu_count()):
    """Perform matrix multiplication using multiple processes."""
    if (a.shape[1] != b.shape[0]):
        raise ValueError("wrong shape")

    # create shared array
    mp_arr = mp.RawArray(ctypes.c_double, a.shape[0]*b.shape[1])

    # start processes
    np_args = mp_arr, (a.shape[0], b.shape[1]), a.dtype
    pool = mp.Pool(nprocesses, initializer=init, initargs=(a, b)+np_args)

    # perform multiplication
    for i in pool.imap_unordered(mpdot_slice, slices(a.shape[0], nprocesses)):
        print("done %s" % (i,))
    pool.close()
    pool.join()

    # return result
    return tonumpyarray(*np_args)

Where:

def mpdot_slice(i):
    dot_slice(ga, gb, gout, i)
    return i

def init(a, b, *np_args):
    """Called on each child process initialization."""
    global ga, gb, gout
    ga, gb = a, b
    gout = tonumpyarray(*np_args)

def tonumpyarray(mp_arr, shape, dtype):
    """Convert shared multiprocessing array to numpy array.

    no data copying
    """
    return np.frombuffer(mp_arr, dtype=dtype).reshape(shape)

def slices(nitems, mslices):
    """Split nitems on mslices pieces.

    >>> list(slices(10, 3))
    [slice(0, 4, None), slice(4, 8, None), slice(8, 10, None)]
    >>> list(slices(1, 3))
    [slice(0, 1, None), slice(1, 1, None), slice(2, 1, None)]
    """
    step = nitems // mslices + 1
    for i in range(mslices):
        yield slice(i*step, min(nitems, (i+1)*step))
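
The "no data copying" claim for tonumpyarray can be checked in a single process. The sketch below uses a tiny 2x3 buffer as an assumed example size: it wraps a RawArray with np.frombuffer, writes through the numpy view, and reads the values back from the RawArray itself.

```python
import ctypes
import multiprocessing as mp

import numpy as np

shared = mp.RawArray(ctypes.c_double, 6)   # zero-filled shared buffer
view = np.frombuffer(shared, dtype=np.float64).reshape(2, 3)

view[1, :] = [1.0, 2.0, 3.0]               # write through the numpy view
```

The dtype passed to np.frombuffer has to match the ctypes element type (c_double corresponds to np.float64); with a different dtype the same bytes would be reinterpreted and the reshape would no longer fit.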

To test it:

def test():
    n = 100000
    a = np.random.rand(50, n)
    b = np.random.rand(n, 60)
    assert np.allclose(np.dot(a,b), dot(a,b, nprocesses=2))

On Linux, this multiprocessing version has the same performance as the solution that uses threads and releases the GIL (in the C extension) during computations:

$ python -mtimeit -s'from test_cydot import a,b,out,np' 'np.dot(a,b,out)'
100 loops, best of 3: 9.05 msec per loop

$ python -mtimeit -s'from test_cydot import a,b,out,cydot' 'cydot.dot(a,b,out)' 
10 loops, best of 3: 88.8 msec per loop

$ python -mtimeit -s'from test_cydot import a,b; import mpdot' 'mpdot.dot(a,b)'
done slice(49, 50, None)
..[snip]..
done slice(35, 42, None)
10 loops, best of 3: 82.3 msec per loop

Note: the test was changed to use np.float64 everywhere.

Matrix multiplication means each element of the resulting matrix is calculated separately. That seems like a job for Pool. Since it's homework (and also to follow the SO code), I will only illustrate the use of the Pool itself, not the whole solution.

So, you have to write a routine to calculate the (i, j)-th element of the resulting matrix:

def getProductElement(m1, m2, i, j):
    # some calculations
    return element

Then you initialize the Pool:

from multiprocessing import Pool, cpu_count
pool = Pool(processes=cpu_count())

Then you need to submit the jobs. You could organize them in a matrix too, but why bother? Let's just make a list.

results = []
# Here you need to iterate through the rows of the first and the columns of
# the second matrix. How you do it depends on the implementation (how you store
# the matrices). Also, make sure you check that the dimensions are compatible.
# The simplest case is if you have a list of rows:

N = len(m1)
M = len(m2[0])
for i in range(N):
    for j in range(M):
        results.append(pool.apply_async(getProductElement, (m1, m2, i, j)))

Then fill the resulting matrix with the results:

m = []
count = 0
for i in range(N):
    row = []
    for j in range(M):
        row.append(results[count].get())
        count += 1  # advance to the next submitted job
    m.append(row)

Again, the exact shape of the code depends on how you represent the matrices.

You don't.

Either they return their edits in a format you can use in the main programme, or you use some kind of interprocess communication to have them send their edits over, or you use some kind of shared storage, such as a database or a data structure server like Redis.
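
The first option, returning results in a format the main programme can use, might look like the sketch below. It assumes matrices stored as lists of rows; the names row_product and multiply are hypothetical. Each worker returns its row tagged with the row index, so the parent can place it correctly even when results arrive out of order.

```python
from multiprocessing import Pool


def row_product(task):
    """Compute one row of the product and return it tagged with its index."""
    i, row, m2 = task
    cols = range(len(m2[0]))
    return i, [sum(row[k] * m2[k][j] for k in range(len(m2))) for j in cols]


def multiply(m1, m2, processes=None):
    """Multiply list-of-rows matrices, submitting one task per row of m1."""
    result = [None] * len(m1)
    tasks = ((i, row, m2) for i, row in enumerate(m1))
    with Pool(processes) as pool:
        for i, out_row in pool.imap_unordered(row_product, tasks):
            result[i] = out_row  # the index says where this row belongs
    return result
```

Only the parent writes to result, so nothing needs to be shared. The price is that m2 is pickled and sent along with every task, which is acceptable for small matrices but wasteful for large ones. As with any Pool code, multiply should be called from under an `if __name__ == '__main__':` guard.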
