
Creating a Python binding for a cppcoro generator

I am trying to write a C++ class with a generator method that can be driven from Python, where each step yields a list of values. For coroutines I am using a maintained fork of cppcoro.

Here's an example:

#include <vector>
#include <experimental/random>
#include <cppcoro/generator.hpp>

class RandomVectorGenerator{
    int Range;
    int Limit;
public:
    RandomVectorGenerator(int range, int limit): Range(range), Limit(limit){}

    // Lazily yields Limit vectors; each has 0..Range elements in [0, Range].
    cppcoro::generator<std::vector<int>> get_random_vector(){
        for(int i = 0; i < Limit; i++) {

            int random_length = std::experimental::randint(0, Range);
            std::vector<int> random_vector;

            // j instead of i, so the outer loop counter is not shadowed
            for (int j = 0; j < random_length; j++) {
                int random_value = std::experimental::randint(0, Range);
                random_vector.push_back(random_value);
            }
            co_yield random_vector;
        }
        co_return;
    }

};

Given Range and Limit, this class yields Limit integer vectors (fewer if iteration stops early), each holding between 0 and Range values drawn from 0 to Range.
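For reference, the behaviour the Python binding is expected to reproduce can be written as a plain Python generator. This is only a sketch of the intended semantics, not part of the original code; the function name random_vectors_reference and the optional seed parameter are assumptions added for illustration.

```python
import random


def random_vectors_reference(value_range, limit, seed=None):
    """Yield `limit` lists, each with 0..value_range ints in [0, value_range].

    Hypothetical pure-Python model of get_random_vector(); `seed` is an
    illustrative extra so that a run can be repeated.
    """
    rng = random.Random(seed)
    for _ in range(limit):
        # Both bounds are inclusive, like std::experimental::randint.
        length = rng.randint(0, value_range)
        yield [rng.randint(0, value_range) for _ in range(length)]


if __name__ == "__main__":
    for vector in random_vectors_reference(5, 5):
        print(vector)
```

An empty list corresponds to the blank lines that show up in the C++ output when the random length is 0.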

Using it in C++ as follows:

#include <iostream>

int main() {
    RandomVectorGenerator generator = RandomVectorGenerator(5, 5);
    // gen owns the coroutine, so it must outlive every iterator taken from it
    auto gen = generator.get_random_vector();
    auto iter = gen.begin();
    // Check for the end before dereferencing, so an empty generator is safe
    while (iter != gen.end()) {
        std::vector<int> solution = *iter;
        for (std::size_t j = 0; j < solution.size(); j++) {
            std::cout << solution[j] << " ";
        }
        std::cout << std::endl;
        ++iter;
    }
    return 0;
}

As expected, I get output such as this (a blank line is an empty vector):

2 2 4 1 
0 5 2 

0 
2 4 

If I bind the class and its methods to Python as follows:

#include <pybind11/stl.h>
#include <pybind11/pybind11.h>
namespace py = pybind11;

PYBIND11_MODULE(random_vectors, m) {
    py::class_<RandomVectorGenerator>(m, "random_vectors")
    .def(py::init<int, int>())
    .def("__iter__", [](RandomVectorGenerator &generator) { 
        auto gen = generator.get_random_vector(); 
        return py::make_iterator(gen.begin(), gen.end()); 
        },
        py::keep_alive<0, 1>());
};

This binding compiles and creates an importable module. However, when I proceed to use the iterator,

from random_vectors import random_vectors

generator = random_vectors(5, 5)
iterator = iter(generator)

print(next(iterator))

Running the code above in a fresh kernel causes next(iterator) to raise StopIteration.

Running it again after the first time gives output. The output length is in the expected range, but the values are garbage, for example [1661572905, 5, 1514791955, -1577772014].

Furthermore, if I call next(iterator) again, the kernel silently crashes.

I can reproduce the behaviour on C++ side by modifying int main() as such:

int main() {
    RandomVectorGenerator generator = RandomVectorGenerator(5, 5);
    auto iter = generator.get_random_vector().begin();             //Here's a change
    while (true) {
        std::vector<int> solution = *iter;
        for (int j = 0; j < solution.size(); j++) {
            std::cout << solution[j] << " ";
        }
        std::cout << std::endl;
        ++iter;
        if (iter == generator.get_random_vector().end()) break;    //Also here
    }
    return 0;
}

This gives the same output as in Python, but it does not crash silently: the crash happens right at ++iter, with the message Segmentation fault (core dumped).

My guess is that the gen object in the binding is a temporary that is destroyed once the lambda returns, so the iterators handed to py::make_iterator no longer refer to a live generator. I tried changing the py::keep_alive arguments, but to no avail.

I am convinced that for this to work, the begin() and end() methods have to belong to the class itself, as in the pybind11 iterator examples. I cannot define them the way the examples do, though, because the generator has to be created first by calling the coroutine method.

My conclusion is that RandomVectorGenerator has to be derived from cppcoro::generator. If that is right, how would I go about it?

First up, the solution I arrived at earlier turned out to be broken. I still describe it at the end of this answer for reference, because it behaved in an interesting way: very rarely, under seemingly normal conditions, it worked.

I had to resort to a workaround. The resulting Python class does not have an __iter__ method, but that is good enough, because I can create a child class on the Python side after compilation.

The solution is as follows:

#include <pybind11/pybind11.h>   // for py::stop_iteration
namespace py = pybind11;

class RandomVectorGenerator{
    int Range;
    int Limit;
    // The generator is a member, so it outlives the iterators stored next to it.
    cppcoro::generator<std::vector<int>> generator;
    cppcoro::detail::generator_iterator<std::vector<int> > iter;
    cppcoro::detail::generator_sentinel end;
public:
    RandomVectorGenerator(int range, int limit): Range(range), Limit(limit){}

    // (Re)starts the coroutine; call this before the first get_next().
    void make_generator(){
        generator = this->get_random_vector();
        iter = generator.begin();
        end = generator.end();
    }

    // Returns the current vector and advances; maps to Python's __next__.
    std::vector<int> get_next(){
        if (iter == end) throw py::stop_iteration("end reached");
        std::vector<int> vect = *iter;
        ++iter;
        return vect;
    }

    cppcoro::generator<std::vector<int>> get_random_vector(){
        for(int i = 0; i < Limit; i++) {
            int random_length = std::experimental::randint(0, Range);
            std::vector<int> random_vector;
            for (int j = 0; j < random_length; j++) {
                int random_value = std::experimental::randint(0, Range);
                random_vector.push_back(random_value);
            }
            co_yield random_vector;
        }
        co_return;
    }
};

I added a method, make_generator, which creates the cppcoro generator and stores it together with its iterators inside the class. A second method, get_next, wraps the stored iterator and acts as the equivalent of __next__.

This class is then bound to python as such:

PYBIND11_MODULE(random_vectors, m) {
    py::class_<RandomVectorGenerator>(m, "random_vectors")
    .def(py::init<int, int>())
    .def("make_generator", &RandomVectorGenerator::make_generator)
    .def("__next__", &RandomVectorGenerator::get_next);
};

Given that I am going to create a child class in Python afterwards, binding get_next to __next__ isn't strictly necessary, but it felt cute. Note that nowhere do I have to release the GIL.
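The reason a Python subclass is still needed is that for loops and iter() look for __iter__, not __next__. The sketch below shows this with a hypothetical pure-Python stand-in (FakeBound is an assumption for illustration, not the compiled module) that, like the bound class, defines only make_generator and __next__.

```python
class FakeBound:
    """Hypothetical stand-in for the compiled class: __next__ but no __iter__."""

    def __init__(self, values):
        self._values = list(values)
        self._source = iter(())  # nothing to yield until make_generator()

    def make_generator(self):
        # Mirrors the C++ make_generator(): start a fresh run.
        self._source = iter(self._values)

    def __next__(self):
        # Mirrors get_next(): raises StopIteration when exhausted.
        return next(self._source)


class IterableBound(FakeBound):
    """Child class that supplies the missing __iter__."""

    def __iter__(self):
        self.make_generator()
        return self


def is_iterable(obj):
    """Return True if iter(obj) succeeds."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print(is_iterable(FakeBound([1, 2])))      # __next__ alone is not enough
    print(list(IterableBound([1, 2])))
```

next() still works on the stand-in without __iter__, which is why binding __next__ alone is usable by hand but not in a for loop.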

On the Python side, I then use the bound class like so:

from random_vectors import random_vectors

class generator(random_vectors):
    def __init__(self, Range, Limit):
        super().__init__(Range, Limit)
    
    def __iter__(self):
        self.make_generator()
        return self

for i in generator(5,5):
    print(i)

It works as expected, although the solution ends up somewhat hacky. The result is that I can now use C++ coroutines from Python. I was also able to make this work for my more complex use case (Dancing Links), but it crashes for larger problems, so the quest is not complete. Still, it's something.
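One possible variation on the Python side, shown here only as a sketch and not as part of the original answer, is to wrap the bound class by composition instead of inheritance, so that every loop drives its own bound object. FakeBound is again a hypothetical pure-Python stand-in for the compiled class, and VectorStream and its factory argument are assumed names.

```python
class FakeBound:
    """Hypothetical stand-in for the compiled random_vectors class."""

    def __init__(self, values):
        self._values = list(values)
        self._source = iter(())

    def make_generator(self):
        self._source = iter(self._values)

    def __next__(self):
        return next(self._source)


class VectorStream:
    """Iterable wrapper: every iter() call drives its own bound object."""

    def __init__(self, factory):
        # factory: zero-argument callable returning a new bound object,
        # e.g. lambda: random_vectors(5, 5) with the real module.
        self._factory = factory

    def __iter__(self):
        source = self._factory()
        source.make_generator()
        while True:
            try:
                item = next(source)
            except StopIteration:
                return  # StopIteration must not escape a generator body
            yield item


if __name__ == "__main__":
    stream = VectorStream(lambda: FakeBound([[1], [2, 3]]))
    for outer in stream:
        for inner in stream:  # the nested loop gets an independent iterator
            print(outer, inner)
```

With the subclass that returns self from __iter__, a nested loop over the same object would restart and share the single underlying generator. With the real module each fresh object would of course produce different random vectors.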


As for the broken solution: it was much more elegant than what I have above.

This was the solution:

class RandomVectorGenerator{
    int Range;
    int Limit;
public:

    // The generator is created once, in the constructor.
    RandomVectorGenerator(int range, int limit): Range(range), Limit(limit){
        generator = this->get_random_vector();}

    cppcoro::generator<std::vector<int>> generator;

    cppcoro::generator<std::vector<int>> get_random_vector(){
        for(int i = 0; i < Limit; i++) {
            int random_length = std::experimental::randint(0, Range);
            std::vector<int> random_vector;
            for (int j = 0; j < random_length; j++) {
                int random_value = std::experimental::randint(0, Range);
                random_vector.push_back(random_value);
            }
            co_yield random_vector;
        }
        co_return;
    }
};

It is quite similar to the one above, but rather than creating a __next__ I tried to create an __iter__ with pybind11.

Binding was done like so:

PYBIND11_MODULE(random_vectors, m) {
    py::class_<RandomVectorGenerator>(m, "random_vectors")
    .def(py::init<int, int>())
    .def("__iter__", [](RandomVectorGenerator &generator) {
        // The GIL is released here and stays released while make_iterator
        // runs, even though make_iterator creates Python objects.
        py::gil_scoped_release release;
        return py::make_iterator(generator.generator.begin(), generator.generator.end());
        },
         py::keep_alive<0, 1>());
};

The code above compiled and was used in Python like so:

from random_vectors import random_vectors

rv = random_vectors(15,10)

iterator = iter(rv)

while True:
    try:
        print(next(iterator))
    except StopIteration:
        break

Running it from a terminal with -X dev resulted in a segmentation fault; line 7 in the traceback is iterator = iter(rv):

Fatal Python error: Segmentation fault

Current thread 0x00007f8c889f1740 (most recent call first):
  File "path-to-scrip/test.py", line 7 in <module>

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

This happened every time the script was run from a terminal. Very rarely, however, it worked and produced the expected output when run in Spyder 5.1.5 on Python 3.9.12. This was very inconsistent, but my observation was:

  1. Try running it once in a fresh console
  2. It fails and restarts kernel
  3. Restart kernel manually
  4. Very rarely it started working in said console

Another observation was that this trial and error caused temporary freezing and unresponsiveness. I would also like to note that it never worked in my actual application, which is much more complex.
