
Cython: Passing multiple numpy arrays in one argument with fused types

I have rewritten an algorithm from C to Cython so that I can take advantage of fused types and make it easier to call from Python. The algorithm takes multiple arrays to work on, along with some other parameters. The arrays are accepted as a pointer to pointers (see some_arrays in the example below). I figured I would call the Cython code from Python by providing the arrays as a tuple of numpy arrays, but doing so gets kind of messy with fused types. Here's a simple example of how I have it working now:

import numpy
cimport numpy
from libc.stdlib cimport malloc, free

ctypedef fused test_dtype:
    numpy.float32_t
    numpy.float64_t

cdef int do_stuff(test_dtype **some_arrays):
    # The fused-type branch is selected at compile time for each specialization.
    if test_dtype is numpy.float32_t:
        return 1
    elif test_dtype is numpy.float64_t:
        return 2
    else:
        return -1

def call_do_stuff(tuple some_arrays):
    cdef unsigned int num_items = len(some_arrays)
    cdef unsigned int i
    cdef numpy.ndarray[numpy.float32_t, ndim=2] tmp_arr32
    cdef numpy.ndarray[numpy.float64_t, ndim=2] tmp_arr64
    cdef void **the_pointer = <void **>malloc(num_items * sizeof(void *))
    if not the_pointer:
        raise MemoryError("Could not allocate memory")
    try:
        # The dtype of the first array decides which specialization is called.
        if some_arrays[0].dtype == numpy.float32:
            for i in range(num_items):
                tmp_arr32 = some_arrays[i]
                the_pointer[i] = &tmp_arr32[0, 0]
            return do_stuff(<numpy.float32_t **>the_pointer)
        elif some_arrays[0].dtype == numpy.float64:
            for i in range(num_items):
                tmp_arr64 = some_arrays[i]
                the_pointer[i] = &tmp_arr64[0, 0]
            return do_stuff(<numpy.float64_t **>the_pointer)
        else:
            raise ValueError("Array data type is unknown")
    finally:
        # Release the pointer table on every exit path.
        free(the_pointer)

I realize that I can specify the type in the tuple, but nothing more specific than "object", if I understand it correctly. Does anyone know of a cleaner way to do what I'm trying to do? Any other Cython tips are appreciated.

Other arguments are passed as well, including a fill_value argument of the same type as the arrays. The code would be simpler if test_dtype could be determined at call time from the arrays or the fill argument, but I can't find a good way to guarantee that C receives the value in the correct type. For example, passing numpy.nan or numpy.float64(numpy.nan) does not guarantee the data type.

After programming Python and NumPy for 10 years (and C, C++, Matlab and Fortran for 10 years before that), this is my general impression:

It is often easier to write numerical code in C, C++ or Fortran than in Cython. The only exception I can think of is the smallest of code snippets. In C++ you have the luxury of using templates and the STL (and Boost if you like).

Learn to use the NumPy C API. The PyArrayObject (which is what a NumPy array is called in C) has a type number you can use for dispatch. You obtain it with the macro PyArray_TYPE() on your PyArrayObject*. numpy.float64 maps to type number NPY_FLOAT64, numpy.float32 maps to type number NPY_FLOAT32, etc. There are corresponding C and C++ typedefs which you can use in your C or C++ code: if PyArray_TYPE(x) == NPY_FLOAT64, the data type to use in C or C++ is npy_float64. This way you can write C or C++ code whose types are determined entirely by the NumPy arrays you pass in.

I usually use a switch statement on PyArray_TYPE(x), with cases for NPY_FLOAT64, NPY_FLOAT32, etc. For each case I call a templated C++ function with the correct template type. This keeps the amount of code I need to write to a minimum.
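The switch-plus-template pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: TypeNum, TYPE_FLOAT32 and TYPE_FLOAT64 are hypothetical stand-ins for NumPy's NPY_FLOAT32 and NPY_FLOAT64 so that the snippet compiles without the NumPy headers, and sum_elements and dispatch_sum are made-up names. In real code the type number would come from PyArray_TYPE(x).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-ins for NumPy's NPY_FLOAT32 / NPY_FLOAT64 type numbers,
// used here so the sketch does not depend on the NumPy headers.
enum TypeNum { TYPE_FLOAT32, TYPE_FLOAT64 };

// Templated kernel: written once, instantiated once per element type.
template <typename T>
double sum_elements(const void *data, std::size_t n) {
    const T *p = static_cast<const T *>(data);
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += static_cast<double>(p[i]);
    }
    return total;
}

// Runtime dispatch: the switch picks the template instantiation that
// matches the type number of the array.
double dispatch_sum(int type_num, const void *data, std::size_t n) {
    assert(data != nullptr || n == 0);
    switch (type_num) {
    case TYPE_FLOAT32:
        return sum_elements<float>(data, n);
    case TYPE_FLOAT64:
        return sum_elements<double>(data, n);
    default:
        throw std::invalid_argument("unsupported array type");
    }
}
```

A caller would pass the array's data pointer and element count together with its type number, for example dispatch_sum(TYPE_FLOAT32, buffer, 3) for a buffer of three floats. Only the switch has to be touched when a new dtype is supported.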

http://docs.scipy.org/doc/numpy/reference/c-api.html

Cython is good for wrapping C and C++ and avoiding tedious Python C API coding, but there is a limit to how much you can statically type arguments. For "down-to-the-iron" numerical code I think it is better to use plain C++, but Cython is an excellent tool for exposing it to Python. So write your numerical stuff in C++ and use Cython to call your C++. That would be the best advice I can give. Cython is an excellent tool for writing C extensions to Python, but it is not a replacement for C++ when C++ is what you really want.

As for your question: the thing you want to do is not really possible, because in C or C++, which is what Cython emits, numpy.ndarray is PyArrayObject* regardless of dtype. So you need to hand-code the switch statement.
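For the multiple-array case in the question, a hand-coded switch could look roughly like the sketch below. It is an assumption-laden illustration, not the asker's algorithm: TypeNum and its members are hypothetical stand-ins for NumPy type numbers, fill_all stands in for the real kernel, and call_fill_all is a made-up entry point. The fill value is accepted as a double and converted to the element type inside the chosen branch, which is one possible way to make sure the kernel receives it in the correct type.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for NPY_FLOAT32 / NPY_FLOAT64.
enum TypeNum { TYPE_FLOAT32, TYPE_FLOAT64 };

// Stand-in kernel taking a pointer to pointers, like do_stuff in the question.
// It writes fill_value into every element and returns the number written.
template <typename T>
std::size_t fill_all(T **arrays, std::size_t num_arrays, std::size_t len,
                     T fill_value) {
    for (std::size_t i = 0; i < num_arrays; ++i) {
        for (std::size_t j = 0; j < len; ++j) {
            arrays[i][j] = fill_value;
        }
    }
    return num_arrays * len;
}

// Converts the untyped buffer table into typed pointers, then runs the kernel.
template <typename T>
std::size_t run_typed(void *const *buffers, std::size_t num_arrays,
                      std::size_t len, double fill_value) {
    std::vector<T *> typed(num_arrays);
    for (std::size_t i = 0; i < num_arrays; ++i) {
        typed[i] = static_cast<T *>(buffers[i]);
    }
    // The fill value is converted to the element type here, in one place.
    return fill_all<T>(typed.data(), num_arrays, len,
                       static_cast<T>(fill_value));
}

// Hand-coded dispatch on the (shared) type number of the arrays.
std::size_t call_fill_all(int type_num, void *const *buffers,
                          std::size_t num_arrays, std::size_t len,
                          double fill_value) {
    switch (type_num) {
    case TYPE_FLOAT32:
        return run_typed<float>(buffers, num_arrays, len, fill_value);
    case TYPE_FLOAT64:
        return run_typed<double>(buffers, num_arrays, len, fill_value);
    default:
        throw std::invalid_argument("unsupported array type");
    }
}
```

The sketch assumes that all arrays share one dtype and one length and are contiguous. A real wrapper would need to check this for every array in the tuple before dispatching.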
