
Can you create a python class via pybind11?

Currently, when working with Python + pybind11, I find it frustrating to work with the typed C++ classes/structs.

I would like to change my bindings so that they generate a simple Python class, with an __init__ and a simple method like the one shown below. Is something like this feasible?

Reasoning:
I currently have a struct that I generate via C++, but it has a lot of heavy std::vector<float>s that I would like to pass to Python and keep as numpy arrays inside a similar interfacing Python class. (Bonus points if you can tell me how to move vectors to be numpy arrays quickly!)

I have already completely bound my C++ struct with pybind11, so I feel like I know what I'm doing... however, I can't seem to figure out whether this is possible!

So, as a learning exercise, can I make the following Python class via pybind11?

class MyStruct:
    def __init__(self, A_in, descriptor_in):
        self.A = A_in
        self.descriptor = descriptor_in

    def add_to_vec(self, f_in):
        self.A.append(f_in)

Edit: I want to say I 'think' that this is doable with the Python C API, but I'd like to avoid using that directly if I can. (But if you think that's the only way, please let me know :) )

Edit 2: (response to @Erwan)
The only way I'm aware of to get the class variables individually is shown below. You cannot use pybind11's advertised buffer_protocol interface if the struct has more than one numpy array you would like to get. This approach also requires creating a Python-interface-only .def function (not ideal) that returns (what I think is a copy of) the original data, so it's probably slow; I haven't benchmarked it, and I'm not sure whether this is a hack or the correct way to get vectors into numpy arrays. (A zero-copy alternative is sketched after the code block.)

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <vector>
#include <string>


struct Pet {
    Pet(const std::string &name) : name(name) { 
            bdata.push_back(22.);
            bdata.push_back(23.1);
            bdata.push_back(24.);
            bdata.push_back(2222.);
        }
    void setName(const std::string &name_) { name = name_; }
    const std::string &getName() const { return name; }

    std::string name;
    std::vector<float> bdata;
};


namespace py = pybind11;

PYBIND11_MODULE(example, m) {
    py::class_<Pet>(m, "Pet")
            .def(py::init<const std::string &>())
            .def("setName", &Pet::setName)
            .def("getName", &Pet::getName)
            .def("bdata", [](Pet &m) -> py::array {
                    py::buffer_info buff_info(py::buffer_info(
                            m.bdata.data(),                               /* Pointer to buffer */
                            sizeof(float),                          /* Size of one scalar */
                            py::format_descriptor<float>::format(), /* Python struct-style format descriptor */
                            m.bdata.size()                                      /* Number of dimensions */
                    ));
                    return py::array(buff_info);
            });
}
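For comparison, here is a hedged sketch (not from the original post) of a zero-copy variant of the binding above: bdata is exposed as a read-only property that returns a py::array_t<float> built directly on the vector's storage, with the owning Python object passed as the array's base, so the data is not copied and the Pet instance stays alive while the array exists. The module name example_view is made up for illustration.

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <string>
#include <vector>

namespace py = pybind11;

// Same Pet struct as above, repeated so this snippet stands on its own.
struct Pet {
    Pet(const std::string &name) : name(name) {
        bdata = {22.f, 23.1f, 24.f, 2222.f};
    }
    std::string name;
    std::vector<float> bdata;
};

PYBIND11_MODULE(example_view, m) {
    py::class_<Pet>(m, "Pet")
            .def(py::init<const std::string &>())
            .def_property_readonly("bdata", [](py::object &obj) {
                    Pet &p = obj.cast<Pet &>();
                    // 1-D float32 array pointing straight at the vector's storage.
                    // Passing 'obj' as the base makes this a view (no copy) and
                    // keeps the Pet instance alive as long as the array exists.
                    return py::array_t<float>(
                            {p.bdata.size()},   /* shape */
                            {sizeof(float)},    /* strides (contiguous) */
                            p.bdata.data(),     /* data pointer */
                            obj);               /* base / owner */
            });
}

On the Python side, Pet('fido').bdata then behaves much like the numpy.frombuffer view discussed in the answer below: writes are visible on both sides, and the same caveat about the vector reallocating applies.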

I don't fully understand your question, but I'll take this part:

bonus points if you can tell me how to move vectors to be numpy arrays quickly!

If you use the return value of bdata.data(), combined with numpy.frombuffer() (and bdata.size() if need be), you can get a view on the vector data, which is guaranteed to be contiguous as of C++11. (The normal numpy.array() call will not honor copy=False in this case, but frombuffer acts like a cast.) Since there is no copy, that is probably as quick as it gets.

Below is an example in cppyy (which allows for easy testing, but the use of which is otherwise immaterial to the question of how to mix std::vector and numpy.array per se). The gravy is in the last few lines: the update to 'arr' will show up in the original vector (and vice versa) because frombuffer creates a view, not a copy:

import cppyy
import numpy as np

# load struct definition
cppyy.cppdef("""
struct Pet {
    Pet(const std::string &name) : name(name) {
        bdata.push_back(22.);
        bdata.push_back(23.1);
        bdata.push_back(24.);
        bdata.push_back(2222.);
    }
    void setName(const std::string &name_) { name = name_; }
    const std::string &getName() const { return name; }

    std::string name;
    std::vector<float> bdata;
};""")

# create a pet object
p = cppyy.gbl.Pet('fido')

print(p.bdata[0]) # for reference (prints 22, per above)

# create a numpy view on the std::vector's data
#   add count=p.bdata.size() if need be
arr = np.frombuffer(p.bdata.data(), dtype=np.float32)

# prove that it worked as intended
arr[0] = 14
print(p.bdata[0]) # shows update to 14
p.bdata[2] = 17.5
print(arr[2])     # shows update to 17.5

which will print:

22.0
14.0
17.5

'arr' may become invalid if the std::vector resizes. If you know the maximum size, however, and it is not too large (or will be fully used for sure), you can reserve() that capacity up front, so the vector's internal data will not be reallocated.
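A tiny sketch of that reserve idea (my own illustration; MAX_ELEMS is an assumed upper bound, not something from the original post):

#include <cassert>
#include <vector>

int main() {
    constexpr std::size_t MAX_ELEMS = 1024;   // assumed upper bound on the vector's size

    std::vector<float> bdata;
    bdata.reserve(MAX_ELEMS);                 // allocate the full capacity up front
    bdata.push_back(22.f);

    const float *view_ptr = bdata.data();     // the pointer a numpy.frombuffer view would hold

    // Further growth stays within the reserved capacity, so no reallocation
    // happens and the pointer held by the numpy view remains valid.
    while (bdata.size() < MAX_ELEMS)
        bdata.push_back(0.f);

    assert(bdata.data() == view_ptr);         // the storage was never moved
    return 0;
}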

Depending on how/where you store the numpy array, I also recommend tying the lifetime of 'p' (and hence 'p.bdata') to 'arr', e.g. by keeping them both as data members in an instance of the wrapper class you're after.

If you want to do the conversion in C++ instead, use PyArray_FromBuffer from NumPy's array API.
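A rough sketch of what that could look like (my own illustration, with error handling trimmed; it assumes NumPy's import_array() has already been called during module initialization). The vector's storage is wrapped in a memoryview and handed to PyArray_FromBuffer, which builds an ndarray that references the buffer rather than copying it:

#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#include <Python.h>
#include <numpy/arrayobject.h>
#include <vector>

// Sketch only: the vector must outlive the returned array, since the
// memoryview wraps the raw pointer without owning the underlying memory.
static PyObject *vector_as_array(std::vector<float> &v) {
    // Expose the vector's contiguous storage as a writable memoryview (no copy).
    PyObject *mv = PyMemoryView_FromMemory(
            reinterpret_cast<char *>(v.data()),
            static_cast<Py_ssize_t>(v.size() * sizeof(float)),
            PyBUF_WRITE);
    if (mv == nullptr)
        return nullptr;

    // Reinterpret that buffer as a 1-D float32 ndarray; still no copy.
    PyObject *arr = PyArray_FromBuffer(
            mv, PyArray_DescrFromType(NPY_FLOAT32),
            static_cast<npy_intp>(v.size()), /* offset = */ 0);

    Py_DECREF(mv);   // the ndarray keeps its own reference to the memoryview
    return arr;      // new reference, or nullptr on error
}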
