
Convert a std::vector to a NumPy array without copying data

I have a C++ library in which some methods currently return a std::vector, declared like this:

public:
  const std::vector<uint32_t>& getValues() const;

I'm currently wrapping the whole library for Python using SWIG, and this has worked well so far.

SWIG wraps getValues() without trouble, so that it returns a Python tuple. The issue is that in my Python-side code I want to convert this to a NumPy array. Of course I can do this with:

my_array = np.array(my_object.getValues(), dtype='uint32')

but this copies every entry of the original vector twice: first into a Python tuple by SWIG, and then into a NumPy array by me. Since the vector could be very large, I'd rather avoid both copies and would like a way to have SWIG create a numpy.array wrapper around the original vector data in memory.
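To make the difference concrete, here is a minimal sketch in plain Python and NumPy, with no SWIG involved. The tuple `values` stands in for what SWIG returns, and the `bytearray` named `buf` stands in for memory owned by something else; both are assumptions for illustration and not part of the library above.

```python
import struct
import numpy as np

values = (1, 2, 3, 4)                        # stand-in for the tuple SWIG returns
copied = np.array(values, dtype=np.uint32)   # allocates a new buffer and copies each element

# Stand-in for memory owned elsewhere: four native-endian 32-bit unsigned ints.
buf = bytearray(struct.pack("=4I", *values))
shared = np.frombuffer(buf, dtype=np.uint32)  # wraps buf's memory, no copy

# Change the underlying memory: only the wrapping array sees the new value.
struct.pack_into("=I", buf, 0, 99)
print(shared[0], copied[0])  # 99 1
```

What I am after is the second behavior, with the vector's storage playing the role of `buf`.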

I've read the documentation for numpy.i, but it explicitly mentions that output arrays are not supported, since it seems to work under the assumption of C-style arrays rather than C++ vectors.

The data underlying a numpy.array is just a C-style array, as is the data underlying a C++ std::vector, so I would hope it is feasible to have them access the same data in memory.

Is there any way to make SWIG return a numpy.array which doesn't copy the original data?

Apparently it is trivial to "cast" a C++ vector to a (C) array; see the answers to this question: How to convert vector to array in C++

Next, you can create a NumPy array that uses that C array without copying; see the discussion here, or search for PyArray_SimpleNewFromData.

I wouldn't expect SWIG to do all of this for you automatically. Instead, you should probably write a wrapper for your function getValues yourself, something like getValuesAsNumPyArray.
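The wrapper itself would be written against the NumPy C API, but the idea can be sketched from the Python side with ctypes. This assumes the wrapped object could expose the address of the vector's data and its element count; the helper name `as_uint32_array` is hypothetical, and a ctypes array stands in for the vector's storage so that the sketch runs on its own.

```python
import ctypes
import numpy as np

def as_uint32_array(address, length):
    """Wrap `length` uint32 values starting at `address` without copying.

    Hypothetical helper: the caller must keep the owning object alive
    for as long as the returned array is in use.
    """
    ptr = ctypes.cast(address, ctypes.POINTER(ctypes.c_uint32))
    arr = np.ctypeslib.as_array(ptr, shape=(length,))
    arr.flags.writeable = False  # getValues() returns a const reference
    return arr

# Stand-in for the vector's contiguous storage (assumption for illustration).
storage = (ctypes.c_uint32 * 4)(10, 20, 30, 40)
arr = as_uint32_array(ctypes.addressof(storage), len(storage))

storage[0] = 11   # a change on the "C++ side" ...
print(arr[0])     # ... is visible through the array: 11
```

Marking the array read-only mirrors the const reference returned by getValues; drop that line if Python is meant to modify the data.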

It seems that PyArray_SimpleNewFromData would require you to do your own memory management. If memory management is already handled on the C++ side, that is, Python is not responsible for the memory, you can use np.asarray in Cython to get a NumPy array that shares memory with the C++ vector. The vector must outlive the array and must not reallocate while the array is in use. For example:

from libcpp.vector cimport vector
import numpy as np
cdef vector[double] vec
vec.push_back(1)
vec.push_back(2)
cdef double *vec_ptr = &vec[0]    # pointer to the array underlying vec; vec.data() also works with C++11
cdef double[::1] vec_view = <double[:vec.size()]>vec_ptr    # cast to a typed memoryview (no copy)
vec_npr = np.asarray(vec_view)    # NumPy array backed by the memoryview (no copy)
print(vec_npr)    # [1. 2.]

The "Wrapping C and C++ Arrays" section in chapter 10 of Kurt Smith's Cython book provides good examples of this. Also see "Coercion to NumPy" in the official user guide.
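An array obtained this way does not own its memory, which can be checked from Python. The sketch below uses a ctypes array as a stand-in for the vector's buffer (an assumption, so that it runs without Cython) and shows how to detach a copy when the data has to outlive the C++ object.

```python
import ctypes
import numpy as np

# Stand-in for the vector's buffer (assumption for illustration).
storage = (ctypes.c_double * 2)(1.0, 2.0)
view = np.frombuffer(storage, dtype=np.float64)  # shares memory with storage

print(view.flags.owndata)    # False: NumPy will not free this memory

detached = view.copy()       # independent copy that owns its data
storage[0] = 5.0             # later change on the owning side
print(view[0], detached[0])  # 5.0 1.0
```

Copying gives up the zero-copy benefit, so it only makes sense at the point where the owner is about to go away.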

