Boost-Python: Expose a class to Python which is a subclass of a Python class (str)

Question

I am trying to have a Boost Python function return an instance of a Python class which is a subclass of a Python builtin class (here str).

My first method involves creating the class in a Python module, mystr.py:

class MyStr(str):
    def __truediv__(self, other):
        return self + other

I then import that module using Boost. To return a Python object of that type, I use something along these lines in C++, importing the module and calling py::exec:

py::object AsMyStr(std::string const &s)
{
    py::object my_str = py::import("mystr");
    py::dict my_namespace(my_str.attr("__dict__"));
    
    my_namespace["_MYSTR_test"] = s;
    py::exec(
        "_MYSTR_test = MyStr(_MYSTR_test)\n",
        my_namespace, my_namespace);
    return my_namespace["_MYSTR_test"];
}

Exposing this function in a Boost-Python module correctly gives me a MyStr instance on the Python side, which can be used accordingly:

 a = AsMyStr("Hello")
 b = " World"
 print(a / b)
 # "Hello World"
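For reference, the pure-Python behaviour can be checked without Boost at all. This is a minimal sketch that assumes only the MyStr class from mystr.py above; it also shows that `self + other` produces a plain str, so the result of `/` is not itself a MyStr.

```python
class MyStr(str):
    def __truediv__(self, other):
        # str.__add__ returns a plain str, even when self is a subclass instance
        return self + other


a = MyStr("Hello")
result = a / " World"

print(result)                # Hello World
print(isinstance(a, str))    # True: MyStr is a real subclass of str
print(type(result) is str)   # True: the concatenation is a plain str
```

If the result should stay a MyStr, the method would have to wrap it, for example `return MyStr(self + other)`.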

I just wonder if the subclassing of str can be done on the Boost-Python side of things in C++. I cannot manage to get __truediv__ to work in that case:

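class MyStr : public py::str
{
public:
    // Construct from any Python object by delegating to py::str
    MyStr(py::object const &o) : py::str(o) {}

    // Intended to back Python's "/" operator
    MyStr __truediv__(py::object const &other)
    {
        return MyStr(*this + other);
    }
};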

Exposing it as a module:

 BOOST_PYTHON_MODULE(MyStr)
 {
     py::class_<MyStr, py::bases<py::str>>("MyStr", py::no_init)
         .def(py::init<py::object const &>())
         .def("__truediv__", &MyStr::__truediv__)
         ;
 }

But using this class on the Python side leads to:

 a = MyStr("Hello")
 b = " World"
 print(a / b)
 # ValueError: character U+5555eaa0 is not in range [U+0000; U+10ffff]

How do I have to define and expose the class MyStr in the C++ implementation so that the Python side gets a "true" MyStr which is a subclass of str?

I uploaded the code to https://gitlab.com/kohlrabi/learn-boost-python. The master branch contains the first solution; the cpp_class branch contains the second, non-working solution.

Answer 1

The range U+0000 to U+10FFFF represents all possible Unicode code points.
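The limit can be checked from Python itself. The sketch below uses a hypothetical helper, `is_valid_code_point` (not part of any library), to show that the value from the error message lies far outside the valid range.

```python
import sys

# Highest code point Python accepts
print(hex(sys.maxunicode))  # 0x10ffff


def is_valid_code_point(value):
    """Hypothetical helper: True if chr() accepts the value."""
    try:
        chr(value)
        return True
    except (ValueError, OverflowError):
        return False


print(is_valid_code_point(0x10FFFF))    # True
print(is_valid_code_point(0x110000))    # False
print(is_valid_code_point(0x5555EAA0))  # False: the value from the error message
```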

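Your string is probably being re-encoded somewhere between C++ and Python, so you can try decoding it with an explicit encoding, as in .decode('cp1252'). Or you can use .decode('utf-8', 'surrogateescape'), and the bad bytes will be kept as lone surrogates in the resulting string. (Note that surrogatepass does not do this: it only lets encoded surrogate code points through and still raises on other invalid bytes.)

Change the error handler to replace and the bad bytes become the replacement character U+FFFD (usually displayed as a question mark); change it to ignore and they disappear.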
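To see how the decode error handlers differ, here is a small sketch on a made-up byte string containing one byte that is never valid in UTF-8. The sample data is an assumption for illustration, not taken from the question's code. It uses `surrogateescape` as the handler that preserves undecodable bytes, and also shows that `surrogatepass` still raises on such a byte.

```python
raw = b"Hello \xff World"  # 0xff can never appear in valid UTF-8

replaced = raw.decode("utf-8", "replace")         # bad byte -> U+FFFD
ignored = raw.decode("utf-8", "ignore")           # bad byte dropped
escaped = raw.decode("utf-8", "surrogateescape")  # bad byte -> lone surrogate U+DCFF

# surrogateescape is reversible: encoding restores the original bytes
round_trip = escaped.encode("utf-8", "surrogateescape")

# surrogatepass only allows encoded surrogates; other invalid bytes still fail
try:
    raw.decode("utf-8", "surrogatepass")
    surrogatepass_failed = False
except UnicodeDecodeError:
    surrogatepass_failed = True

# ascii() avoids terminal encoding problems when printing lone surrogates
print(ascii(replaced))
print(ascii(ignored))
print(ascii(escaped))
print(round_trip == raw, surrogatepass_failed)
```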
