A function in Cython changes the numpy array type

Question

I am working with Cython and numpy, and have a strange issue: a Cython function appears to change the dtype of the elements of a numpy array. Strangely, the dtype only changes when the input type of the array is actually specified.

I am using Cython==0.29.11, numpy==1.15.4, python 3.6, on Ubuntu 18.04.

# cyth.pyx
cimport numpy as np

def test(x):
    print(type(x[0]))

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    print(type(x[0]))

Now cythonising this file and using the functions:

>>> from cyth import test, test_np
>>> import numpy as np
>>> a = np.array([1, 2], dtype=np.uint32)
>>> test(a)
<class 'numpy.uint32'>
>>> test_np(a)
<class 'int'>

So test works as expected, printing the type of the first element in the input array: a uint32. But test_np, which actually ensures that the type of the incoming array is uint32, now shows a regular Python int as the type of the first element.

Even trying to force the element to be of the right type does not work, i.e. using:

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    cdef np.uint32_t el
    el = x[0]
    print(type(el))

still results in

>>> test_np(a)
<class 'int'>

Any help in understanding this discrepancy would be greatly appreciated.

Answer 1

Cython doesn't change the type of the array, but returns an element of a slightly different type.

The data in a numpy array is stored as a contiguous block of 32-bit unsigned integers. Accessing x[0] means creating a Python object (because the Python interpreter cannot handle raw C ints): numpy has a dedicated wrapper class for every numpy dtype and returns an np.uint32 object.

Cython, on the other hand, simply maps all C integer types (e.g. long, int and so on) onto the Python integer (which makes sense).
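The same mapping can be seen without Cython. The sketch below uses the standard-library ctypes module as a stand-in (an assumption for illustration only; it is not what Cython generates) to show raw unsigned 32-bit C integers turning into plain Python int objects when they are read:

```python
import ctypes

# A contiguous block of two unsigned 32-bit C integers; no Python objects inside.
raw = (ctypes.c_uint32 * 2)(1, 2)

# Reading an element has to produce a Python object, and ctypes picks the built-in int.
first = raw[0]

print(type(first))         # <class 'int'>
print(ctypes.sizeof(raw))  # 8 (two 4-byte integers)
```

Only numpy's own indexing wraps the value in a dtype-specific scalar class such as np.uint32.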

Now, when numpy is cimported and x is declared as a typed buffer, x[0] no longer means calling __getitem__() of the numpy array (which would return an np.uint32 object) but reading a C integer (in this case an unsigned 4-byte one). That C integer is converted to a Python integer wherever a def function needs a Python object, for example in "return XXX" or when it is passed to type(...).

This doesn't mean that the array has a different type; the types are just mapped differently when Cython converts them to Python objects.

If you want to access data as np.uint32 objects, you could call __getitem__ instead of [..] (Cython translates [..] into access to the raw C data):
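A rough plain-Python illustration of the two mappings, assuming numpy is installed. Here a.item(0) stands in for Cython's C-to-Python conversion; it is an analogy, not the code Cython actually emits:

```python
import numpy as np

a = np.array([1, 2], dtype=np.uint32)

via_numpy = a[0]                  # ndarray.__getitem__ wraps the value in np.uint32
as_py_int = a.item(0)             # plain Python int, like the result of Cython's conversion
rewrapped = np.uint32(as_py_int)  # wrap again explicitly if the numpy scalar type is needed

print(type(via_numpy).__name__)   # uint32
print(type(as_py_int).__name__)   # int
print(a.dtype)                    # uint32, the array itself is unchanged
```

In a Cython function the same re-wrapping would need a regular `import numpy as np` alongside the cimport, since np.uint32(...) is a runtime Python call.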

%%cython
cimport numpy as np

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    print(type(x[0]))                     # int
    print(type(x.__getitem__(0)))         # numpy.uint32

When you use typed memory views rather than ndarray, calling __getitem__ directly will still return a Python integer: __getitem__ of the memory view doesn't call __getitem__ of the underlying ndarray but accesses the data on the C level. To call __getitem__ of the underlying object for a memory view:

cimport numpy as np

def test_np(np.uint32_t[:] x):
    print(type(x[0]))                      # int
    print(type(x.base.__getitem__(0)))     # numpy.uint32, instead of x.__getitem__(0)
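Python's built-in memoryview behaves in a comparable way, which may help as a mental model. The sketch below is only an analogy: it uses the built-in memoryview and its obj attribute, not a Cython typed memory view and its base attribute, and assumes numpy is installed:

```python
import numpy as np

a = np.array([1, 2], dtype=np.uint32)
mv = memoryview(a)      # built-in buffer view, NOT a Cython typed memory view

from_view = mv[0]       # read straight from the buffer: plain Python int
from_owner = mv.obj[0]  # goes through ndarray.__getitem__: np.uint32

print(type(from_view).__name__)   # int
print(type(from_owner).__name__)  # uint32
```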
