Returning NumPy arrays via pybind11

I have a C++ function computing a large tensor which I would like to return to Python as a NumPy array via pybind11.

From the documentation of pybind11, it seems like using STL unique_ptr is desirable. In the following example, the commented-out version works, whereas the given one compiles but fails at runtime ("Unable to convert function return value to a Python type!").

Why is the smart-pointer version failing? What is the canonical way to create and return a NumPy array?

PS: Due to the program structure and the size of the array, it is desirable not to copy memory but to create the array from a given pointer. Memory ownership should be taken by Python.

typedef typename py::array_t<double, py::array::c_style | py::array::forcecast> py_cdarray_t;

// py_cdarray_t _test()
std::unique_ptr<py_cdarray_t> _test()
{
    double * memory = new double[3]; memory[0] = 11; memory[1] = 12; memory[2] = 13;
    py::buffer_info bufinfo (
        memory,                                   // pointer to memory buffer
        sizeof(double),                           // size of underlying scalar type
        py::format_descriptor<double>::format(),  // python struct-style format descriptor
        1,                                        // number of dimensions
        { 3 },                                    // buffer dimensions
        { sizeof(double) }                        // strides (in bytes) for each index
    );

    //return py_cdarray_t(bufinfo);
    return std::unique_ptr<py_cdarray_t>( new py_cdarray_t(bufinfo) );
}

A few comments (then a working implementation).

  • pybind11's C++ object wrappers around Python types (like pybind11::object, pybind11::list, and, in this case, pybind11::array_t<T>) are really just wrappers around an underlying Python object pointer. In this respect they already take on the role of a shared pointer wrapper, so there is no point in wrapping one in a unique_ptr: returning the py::array_t<T> object directly is already essentially just returning a glorified pointer.
  • pybind11::array_t can be constructed directly from a data pointer, so you can skip the py::buffer_info intermediate step and just give the shape and strides directly to the pybind11::array_t constructor. A numpy array constructed this way won't own its own data, it'll just reference it (that is, the numpy owndata flag will be set to false).
  • Memory ownership can be tied to the life of a Python object, but you're still on the hook for doing the deallocation properly. pybind11 provides a py::capsule class to help you do exactly this. What you want to do is make the numpy array depend on this capsule as its parent (base) object by specifying it as the base argument to array_t. That will make the numpy array reference it, keeping it alive as long as the array itself is alive, and invoke the cleanup function when it is no longer referenced.
  • The c_style flag in the older (pre-2.2) releases only had an effect on new arrays, i.e. when not passing a value pointer. That was fixed in the 2.2 release to also affect the automatic strides if you specify only shapes but not strides. It has no effect at all if you specify the strides directly yourself (as I do in the example below).
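
The ownership idea in the third bullet can be pictured without pybind11 at all. The sketch below is only an analogy in standard C++, not pybind11 code: a std::shared_ptr<void> with a custom deleter stands in for py::capsule, and a hypothetical ArrayView struct stands in for the numpy array that references its base. The names ArrayView, make_owner, and make_view are invented for illustration, and the freed counter exists only so the cleanup can be observed.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a numpy array that does not own its data:
// it keeps a raw data pointer plus a reference to a "base" owner.
struct ArrayView {
    double *data;
    std::size_t size;
    std::shared_ptr<void> base;  // plays the role of the capsule
};

// Wrap a new[]-allocated buffer in a type-erased owner. The deleter runs
// exactly once, when the last reference to the owner goes away.
// `freed` is an optional counter used only to observe the cleanup.
inline std::shared_ptr<void> make_owner(double *buffer, int *freed = nullptr) {
    return std::shared_ptr<void>(buffer, [freed](void *p) {
        delete[] static_cast<double *>(p);
        if (freed) ++*freed;
    });
}

// Allocate n doubles (0, 1, 2, ...) and return a view whose base keeps
// the buffer alive.
inline ArrayView make_view(std::size_t n, int *freed = nullptr) {
    double *buffer = new double[n];
    for (std::size_t i = 0; i < n; ++i) buffer[i] = static_cast<double>(i);
    return ArrayView{buffer, n, make_owner(buffer, freed)};
}
```

Copying an ArrayView copies the base reference, so the buffer is released only when the last copy goes away. With pybind11 the same bookkeeping is done by Python's reference count on the capsule rather than by a shared_ptr.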

So, putting the pieces together, this code is a complete pybind11 module that demonstrates how you can accomplish what you're looking for (and includes some C++ output to demonstrate that it is indeed working correctly):

#include <iostream>
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

// PYBIND11_MODULE (available since pybind11 2.2) replaces the deprecated
// PYBIND11_PLUGIN macro; it creates the module object `m` for us.
PYBIND11_MODULE(numpywrap, m) {
    m.def("f", []() {
        // Allocate and initialize some data; make this big so
        // we can see the impact on the process memory use:
        constexpr size_t size = 100*1000*1000;
        double *foo = new double[size];
        for (size_t i = 0; i < size; i++) {
            foo[i] = (double) i;
        }

        // Create a Python object that will free the allocated
        // memory when destroyed:
        py::capsule free_when_done(foo, [](void *f) {
            double *foo = static_cast<double *>(f);
            std::cerr << "Element [0] = " << foo[0] << "\n";
            std::cerr << "freeing memory @ " << f << "\n";
            delete[] foo;
        });

        return py::array_t<double>(
            {100, 1000, 1000}, // shape
            {1000*1000*8, 1000*8, 8}, // C-style contiguous strides for double
            foo, // the data pointer
            free_when_done); // numpy array references this parent
    });
}

Compiling that and invoking it from Python shows it working:

>>> import numpywrap
>>> z = numpywrap.f()
>>> # the python process is now taking up a bit more than 800MB memory
>>> z[1,1,1]
1001001.0
>>> z[0,0,100]
100.0
>>> z[99,999,999]
99999999.0
>>> z[0,0,0] = 3.141592
>>> del z
Element [0] = 3.14159
freeing memory @ 0x7fd769f12010
>>> # python process memory size has dropped back down
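
The strides in the module above are written out by hand. As a small sketch in plain C++, independent of pybind11 (the helper names c_contiguous_strides and byte_offset are made up here), this shows how C-style strides follow from the shape and the item size, and why z[1,1,1] reads back 1001001.0 in the session:

```cpp
#include <cstddef>
#include <vector>

// Byte strides for a C-style (row-major) contiguous array: the last axis
// moves by one item, and each earlier axis moves by the full extent of
// all axes after it.
inline std::vector<std::size_t> c_contiguous_strides(
        const std::vector<std::size_t> &shape, std::size_t itemsize) {
    std::vector<std::size_t> strides(shape.size());
    std::size_t step = itemsize;
    for (std::size_t axis = shape.size(); axis-- > 0;) {
        strides[axis] = step;
        step *= shape[axis];
    }
    return strides;
}

// Byte offset of the element at `index`, given byte strides.
inline std::size_t byte_offset(const std::vector<std::size_t> &index,
                               const std::vector<std::size_t> &strides) {
    std::size_t offset = 0;
    for (std::size_t axis = 0; axis < index.size(); ++axis) {
        offset += index[axis] * strides[axis];
    }
    return offset;
}
```

For shape {100, 1000, 1000} and 8-byte doubles this gives strides {8000000, 8000, 8}, matching the literals passed to array_t. Index (1, 1, 1) then lands at byte offset 8008008, which is element 1001001 of the flat buffer, and that element was initialized to its own index.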

I recommend using the ndarray library. A foundational principle of it is that the underlying data is never copied unless explicitly requested (otherwise you quickly end up with huge inefficiencies). Below is an example of it in use, but there are other features I haven't shown, including conversion to Eigen arrays (ndarray::asEigen(array)), which makes it pretty powerful.

Header:

#ifndef MYTENSORCODE_H
#define MYTENSORCODE_H

#include "ndarray_fwd.h"

namespace myTensorNamespace {

ndarray::Array<double, 2, 1> myTensorFunction(int param1, double param2);

}  // namespace myTensorNamespace

#endif  // include guard

Lib:

#include "ndarray.h"
#include "myTensorCode.h"

namespace myTensorNamespace {

ndarray::Array<double, 2, 1> myTensorFunction(int param1, double param2) {
    std::size_t const size = calculateSize();
    ndarray::Array<double, 2, 1> array = ndarray::allocate(size, size);
    array.deep() = 0;  // initialise
    for (std::size_t ii = 0; ii < size; ++ii) {
        array[ii][ndarray::view(ii, ii + 1)] = 1.0;
    }
    return array;
}

}  // namespace myTensorNamespace

Wrapper:

#include "pybind11/pybind11.h"
#include "ndarray.h"
#include "ndarray/pybind11.h"

#include "myTensorCode.h"

namespace py = pybind11;
using namespace pybind11::literals;

namespace myTensorNamespace {
namespace {

PYBIND11_MODULE(myTensorModule, mod) {
    mod.def("myTensorFunction", &myTensorFunction, "param1"_a, "param2"_a);
}

}  // anonymous namespace
}  // namespace myTensorNamespace
