How to return a multidimensional array from a shared library in C to Python as an np.array?

Question

I am currently trying to figure out a way to return a multidimensional array (of doubles) from a shared library in C to Python and make it an np.array. My current approach looks like this:

shared library ("utils.c")

#include <stdio.h>

void somefunction(double *inputMatrix, int d1_inputMatrix, int d2_inputMatrix, int h_inputMatrix, int w_inputMatrix, double *kernel, int d1_kernel, int d2_kernel, int h_kernel, int w_kernel, int stride) {
    
    double result[d1_kernel][d2_kernel][d2_inputMatrix][h_inputMatrix-h_kernel+1][w_inputMatrix-w_kernel+1];
    // ---some operation--

    return result;

}

Now, I compile utils.c with: cc -fPIC -shared -o utils.so utils.c

python ("somefile.py")

from ctypes import *
import numpy as np

so_file = "/home/benni/Coding/5.PK/Code/utils.so"
utils = CDLL(so_file)

INT = c_int64
ND_POINTER_4 = np.ctypeslib.ndpointer(dtype=np.float64, ndim=4, flags="C")

utils.convolve.argtypes = [ND_POINTER_4, INT, INT, INT, INT, ND_POINTER_4, INT, INT, INT, INT, INT]
a = ... #some np.array  with 4 dimensions
b = ... #some np.array  with 4 dimensions

result = utils.somefunction(a, a.shape[0], a.shape[1], a.shape[2], a.shape[3], b, b.shape[0], b.shape[1], b.shape[2], b.shape[3], 1)

Now, how do I convert the result of utils.somefunction() to an np.array? I know that, in order to solve my problem, I have to specify utils.convolve.restype. But what do I have to put for restype if I want the return type to be an np.array?

Answer 1

First of all, a local C array allocated on the stack (like the one in somefunction) must never be returned by a function. Its stack space is reused by later function calls, for example those made by CPython itself, so the caller would read garbage. The returned array must be allocated on the heap instead.

Moreover, writing a function that works with Numpy arrays using ctypes is pretty cumbersome. As you found out, you need to pass the full shape as parameters. Strictly speaking, you would also need to pass the strides of each dimension of each input array, since the arrays may not be contiguous in memory (np.transpose, for example, returns a non-contiguous view). That being said, for the sake of performance and sanity we can assume that the input arrays are contiguous. This can be enforced with np.ascontiguousarray. The pointers of a and b can be extracted using numpy.ctypeslib.as_ctypes, but fortunately ctypes does that automatically when the argument types are declared with ndpointer.

Furthermore, the returned array is currently a C pointer and not a Numpy array. Thus, you need to create a Numpy array with the right shape and strides from it using numpy.ctypeslib.as_array. Because the caller does not know the resulting shape, it has to retrieve it from the callee through several integer pointers (one per dimension).

In the end, this results in big, ugly, bug-prone code (which will often crash silently if anything goes wrong, not to mention the possible memory leaks if you do not pay attention to them). You can use Cython to do most of this work for you.
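To make the contiguity point concrete, here is a minimal sketch (the array sizes and variable names are made up for illustration). It shows that a transposed view shares the original buffer but has different strides, that an ndpointer type declared with the C_CONTIGUOUS flag rejects such a view, and that np.ascontiguousarray produces a copy that is accepted:

```python
import numpy as np

a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
t = a.transpose(2, 0, 1)  # a view: same buffer, permuted strides

print(a.flags["C_CONTIGUOUS"], a.strides)  # True (96, 32, 8)
print(t.flags["C_CONTIGUOUS"], t.strides)  # False (8, 96, 32)

# An ndpointer type with the C_CONTIGUOUS flag refuses the transposed view.
ND_POINTER_3 = np.ctypeslib.ndpointer(dtype=np.float64, ndim=3, flags="C_CONTIGUOUS")
try:
    ND_POINTER_3.from_param(t)
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True

# ascontiguousarray copies the data into a fresh C-ordered buffer.
c = np.ascontiguousarray(t)
print(c.flags["C_CONTIGUOUS"], c.strides)  # True (48, 24, 8)
ND_POINTER_3.from_param(c)  # accepted, no exception
```

Calling np.ascontiguousarray on an array that is already C-contiguous does not copy, so it is cheap to apply it unconditionally before the call into C.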

Assuming you do not want to use Cython, or cannot, here is an example using ctypes:

import ctypes
import numpy as np

# Example of input
a = np.empty((16, 16, 12, 12), dtype=np.float64)
b = np.empty((8, 8, 4, 4), dtype=np.float64)

# More portable than CDLL according to the Numpy documentation.
# Here the DLL/SO file is looked up as:
# Windows:  ".\utils.dll"
# Linux:    "./utils.so" (load_library adds the extension, not a "lib" prefix)
utils = np.ctypeslib.load_library('utils', '.')

INT = ctypes.c_int64
PINT = ctypes.POINTER(ctypes.c_int64)
PDOUBLE = ctypes.POINTER(ctypes.c_double)
ND_POINTER_4 = np.ctypeslib.ndpointer(dtype=np.float64, ndim=4, flags="C_CONTIGUOUS")

utils.somefunction.argtypes = [
    ND_POINTER_4, INT, INT, INT, INT, 
    ND_POINTER_4, INT, INT, INT, INT, 
    PINT, PINT, PINT, PINT, PINT
]
utils.somefunction.restype = PDOUBLE

d1_out, d2_out, d3_out, d4_out, d5_out = INT(), INT(), INT(), INT(), INT()
p_d1_out = ctypes.pointer(d1_out)
p_d2_out = ctypes.pointer(d2_out)
p_d3_out = ctypes.pointer(d3_out)
p_d4_out = ctypes.pointer(d4_out)
p_d5_out = ctypes.pointer(d5_out)
out = utils.somefunction(a, a.shape[0], a.shape[1], a.shape[2], a.shape[3],
                         b, b.shape[0], b.shape[1], b.shape[2], b.shape[3],
                         p_d1_out, p_d2_out, p_d3_out, p_d4_out, p_d5_out)
d1_out = d1_out.value
d2_out = d2_out.value
d3_out = d3_out.value
d4_out = d4_out.value
d5_out = d5_out.value
result = np.ctypeslib.as_array(out, shape=(d1_out, d2_out, d3_out, d4_out, d5_out))

# Some operations

# WARNING: 
# You should free the memory of the allocated buffer 
# with `free(out)` when you are done with `result` 
# since Numpy does not free it for you: it just creates 
# a view and does not take the ownership.
# Note that the right libc must be used, otherwise the
# call to free will cause undefined behaviour
# (e.g. crash, error message, nothing)
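One way to act on the warning above is to copy the data into an array that Numpy owns and then release the C buffer right away. The sketch below is only an illustration under assumptions: it targets a POSIX system where the C runtime can be located with ctypes.util.find_library, and it uses a buffer obtained from libc's malloc as a stand-in for the pointer returned by somefunction, so that it runs without the shared library.

```python
import ctypes
import ctypes.util
import numpy as np

# Assumption: POSIX system; this must be the same C runtime that allocated the buffer.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None

shape = (2, 3, 4)
count = int(np.prod(shape))

# Stand-in for the pointer returned by the C function.
raw = libc.malloc(count * ctypes.sizeof(ctypes.c_double))
if not raw:
    raise MemoryError("malloc failed")
out = ctypes.cast(raw, ctypes.POINTER(ctypes.c_double))
for i in range(count):
    out[i] = float(i)

view = np.ctypeslib.as_array(out, shape=shape)  # no copy: points into the C buffer
result = view.copy()                            # independent array owned by Numpy
del view                                        # never touch the view after the free
libc.free(raw)

print(result.flags["OWNDATA"], result[1, 2, 3])
```

The copy costs one extra pass over the data, but afterwards `result` can be used freely without tracking the lifetime of the C buffer. A common alternative is to export a small deallocation function from the shared library itself, so that allocation and release are guaranteed to use the same C runtime.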

Here is the C code (be aware of the fixed-length types):

/* utils.c */

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

double* somefunction(
        double* inputMatrix, int64_t d1_inputMatrix, int64_t d2_inputMatrix, int64_t h_inputMatrix, int64_t w_inputMatrix, 
        double* kernel, int64_t d1_kernel, int64_t d2_kernel, int64_t h_kernel, int64_t w_kernel,
        int64_t* d1_out, int64_t* d2_out, int64_t* d3_out, int64_t* d4_out, int64_t* d5_out
    )
{
    *d1_out = d1_kernel;
    *d2_out = d2_kernel;
    *d3_out = d2_inputMatrix;
    *d4_out = h_inputMatrix - h_kernel + 1;
    *d5_out = w_inputMatrix - w_kernel + 1;

    const size_t size = *d1_out * *d2_out * *d3_out * *d4_out * *d5_out;
    double* result = malloc(size * sizeof(double));

    if(result == NULL)
    {
        fprintf(stderr, "Unable to allocate an array of %zu bytes\n", size * sizeof(double));
        return NULL;
    }

    /* Some operation: fill `result` */

    return result;
}
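As a sanity check on the Python side, the shape that the C function writes through its five output pointers can be predicted before the call. The helper below is a hypothetical illustration (expected_output_shape is not part of the original code); it simply mirrors the assignments at the top of somefunction and computes how many bytes malloc will be asked for:

```python
import ctypes
from math import prod


def expected_output_shape(input_shape, kernel_shape):
    """Mirror the shape computation done at the top of the C function."""
    _d1_in, d2_in, h_in, w_in = input_shape
    d1_k, d2_k, h_k, w_k = kernel_shape
    return (d1_k, d2_k, d2_in, h_in - h_k + 1, w_in - w_k + 1)


# Shapes of the example inputs `a` and `b` used in the Python code.
shape = expected_output_shape((16, 16, 12, 12), (8, 8, 4, 4))
n_bytes = prod(shape) * ctypes.sizeof(ctypes.c_double)
print(shape, n_bytes)  # (8, 8, 16, 9, 9) 663552
```

Comparing this tuple with the values read back from the output pointers is a cheap way to catch a mismatch between argtypes and the C signature.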

Here are the commands used to build the library with GCC:

# On Windows
gcc utils.c -shared -o utils.dll

# On Linux
gcc utils.c -fPIC -shared -o utils.so

For more information, please read this:
