
Return a 2D Cython pointer to Python array

I am currently passing the following pointer to a pointer from Cython to C:

    # convert the input Python 2D array to a memory view
    cdef double[:, :] a_cython = np.asarray(a, order="C")
    # allocate a table of N row pointers (a pointer to pointers)
    cdef double** point_to_a = <double **>malloc(N * sizeof(double*))
    # check that the allocation succeeded
    if not point_to_a:
        raise MemoryError
    # point each entry at the first element of the corresponding row
    for i in range(N):
        point_to_a[i] = &a_cython[i, 0]

    # pass this double pointer to a C function
    logistic_sigmoid(&point_to_a[0], N, M)

where a is a NumPy array with dimensions N x M, and point_to_a is a Cython pointer to pointers whose entries refer to the rows of the Cython memoryview a_cython. Since the input a from Python is a 2-dimensional array, I thought this was the best approach to pass the information directly to C. The call goes smoothly and the computation is done correctly. However, I am now trying to convert point_to_a back to a NumPy array, and I am struggling a bit.

I am considering various solutions. I would like to explore whether it is possible to keep an N-dimensional array throughout the entire process, so I was experimenting with this approach in Cython:

    # define an integer array for the dimensions
    cdef np.npy_intp dims[2]
    dims[0]=  N
    dims[1] = M
    #create a new memory view and PyArray_SimpleNewFromData to deal with the pointer
    cdef np.ndarray[double, ndim=2] new_a =  np.PyArray_SimpleNewFromData(2, &dims[0], np.NPY_DOUBLE, point_to_a)

However, when I convert new_a to a NumPy array with array = np.asarray(new_a), I get an array containing only 0s. Do you have any ideas?

Thanks very much

As soon as you use int** (or similar), your data is in a so-called indirect memory layout. Cython's typed memory views support indirect memory layout (see for example "Cython: understanding a typed memoryview with a indirect_contignuous memory layout"), however there are not many classes that implement this interface.

NumPy's ndarrays do not implement indirect memory layout; they only support direct memory layouts (e.g. a pointer of type int* and not int**), so passing an int** to a NumPy array will do no good.
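The difference between the two layouts can be sketched in plain Python with ctypes. This is only an illustration under assumptions: the names flat, rows and RowPtr are made up for the example, and a ctypes array stands in for the contiguous block that an ndarray would own. It shows that the pointer table holds addresses rather than matrix values, which is why handing it to NumPy as if it were the data cannot work.

```python
import ctypes

N, M = 3, 4
DOUBLE_SIZE = ctypes.sizeof(ctypes.c_double)

# Direct layout: one contiguous block of N*M doubles (stand-in for ndarray data).
flat = (ctypes.c_double * (N * M))()

# Indirect layout: a table of N row pointers, each pointing into `flat`.
RowPtr = ctypes.POINTER(ctypes.c_double)
rows = (RowPtr * N)()
base_addr = ctypes.addressof(flat)
for i in range(N):
    rows[i] = ctypes.cast(base_addr + i * M * DOUBLE_SIZE, RowPtr)

# Write through the row pointers, as a C function taking double** would.
for i in range(N):
    for j in range(M):
        rows[i][j] = float(i * M + j)

# The values landed in the contiguous block, because the memory is shared.
print(list(flat))

# The table itself only stores addresses; its first slot is the address of row 0.
print(ctypes.addressof(rows[0].contents) == base_addr)
print(ctypes.sizeof(rows), "bytes of pointers vs", ctypes.sizeof(flat), "bytes of data")
```

Reading element (i, j) takes two dereferences through rows but a single offset computation in flat, and only the latter matches what an ndarray expects.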

The good thing is that, because you share the memory with a_cython, the values were already updated in place. You can get the underlying NumPy array by returning the base object of the typed memory view, i.e.

    return a_cython.base  # returns the 2D NumPy array

There is no need to copy memory at all!
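A rough analogy, using only Python's built-in memoryview (a different type from Cython's typed memory view, so this is a sketch of the idea and not of Cython's API): a view shares memory with the object it was created from, writes through the view show up in that object, and the owner can be recovered from the view. Here the obj attribute plays a role comparable to base above; the names owner and view are chosen for the example.

```python
from array import array

N, M = 3, 4

# `owner` holds the memory; `view` exposes the same bytes without copying.
owner = array("d", [0.0] * (N * M))
view = memoryview(owner)

# Write element (1, 2) of the row-major N x M layout through the view.
view[1 * M + 2] = 42.0

# The owner sees the change, since no copy was ever made.
print(owner[1 * M + 2])

# The view remembers which object it belongs to.
print(view.obj is owner)
```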


There are, however, some issues with memory management (e.g. you need to free point_to_a).

This may be overkill in your case, but I will use the opportunity to shamelessly plug a library of mine, indirect_buffer: because alternatives for indirect memory layout buffers are scarce and from time to time one needs one, I've created one to avoid always writing the same code.

With indirect_buffer, your function could look like the following:

    %%cython
    # just an example of a C function
    cdef extern from *:
        """
        void fillit(int** ptr, int N, int M){
           int cnt=0;
           for(int i=0;i<N;i++){
              for(int j=0;j<M;j++){
                ptr[i][j]=cnt++;
              }
           }
        }
        """
        void fillit(int** ptr, int N, int M)

    from indirect_buffer.buffer_impl cimport IndirectMemory2D
    def py_fillit(a):
        # create the collection; it is a view of a
        indirect_view=IndirectMemory2D.cy_view_from_rows(a, readonly=False)
        fillit(<int**>indirect_view.ptr, indirect_view.shape[0], indirect_view.shape[1])
        # values are updated directly in a

which can now be used, for example:

    import numpy as np
    a=np.zeros((3,4), dtype=np.int32)
    py_fillit(a)
    print(a)
    # prints as expected:
    #  array([[ 0,  1,  2,  3],
    #         [ 4,  5,  6,  7],
    #         [ 8,  9, 10, 11]])

The above version does a lot of things right: memory management, locking of buffers and so on.
