I am trying to have a Boost Python function return a Python class which is a subclass of a Python builtin class (here str):
My first method involves creating the class in a Python module, mystr.py:
class MyStr(str):
    def __truediv__(self, other):
        return self + other
I then import that module using Boost, and to return a Python object of that type I use something along these lines in C++, importing the module and calling py::exec:
py::object AsMyStr(std::string const &s)
{
    py::object my_str = py::import("mystr");
    py::dict my_namespace(my_str.attr("__dict__"));
    my_namespace["_MYSTR_test"] = s;
    py::exec(
        "_MYSTR_test = MyStr(_MYSTR_test)\n",
        my_namespace, my_namespace);
    return my_namespace["_MYSTR_test"];
}
Exposing this function in a Boost-Python module correctly gives me a MyStr instance on the Python side, which can be used accordingly:
a = AsMyStr("Hello")
b = " World"
print(a / b)
# "Hello World"
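For comparison, the same behavior can be checked in pure Python without the extension module. This is only a minimal sketch that constructs MyStr directly instead of going through AsMyStr. It shows that the instance is a real str subclass, while the result of the division is a plain str because self + other returns one.

```python
class MyStr(str):
    def __truediv__(self, other):
        # str.__add__ returns a plain str, not a MyStr
        return self + other


a = MyStr("Hello")
b = " World"
result = a / b

print(result)
print(type(a).__name__, isinstance(a, str))
print(type(result).__name__)
```

If the result should stay a MyStr, the method would have to wrap it again, for example with type(self)(self + other).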
I just wonder if the subclassing of str can be done on the Boost-Python side of things in C++. I cannot manage to get __truediv__ to work in that case:
class MyStr : public py::str
{
public:
    MyStr(py::object const &o) : py::str(o) {}
    MyStr __truediv__(py::object const &other)
    {
        return MyStr(*this + other);
    }
};
Exposing it as a module:
BOOST_PYTHON_MODULE(MyStr)
{
    py::class_<MyStr, py::bases<py::str>>("MyStr", py::no_init)
        .def(py::init<py::object const &>())
        .def("__truediv__", &MyStr::__truediv__)
    ;
}
But using this class on the Python side leads to:
a = MyStr("Hello")
b = " World"
print(a / b)
# ValueError: character U+5555eaa0 is not in range [U+0000; U+10ffff]
How do I have to define and expose the class MyStr in the C++ implementation to return on the Python side a "true" MyStr which is a subclass of str?
I uploaded the code to https://gitlab.com/kohlrabi/learn-boost-python , the branch master contains the first solution, the branch cpp_class the second, non-working solution.
The range U+0000 to U+10FFFF represents all possible Unicode code points.
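The range can be checked directly in Python with chr(), which raises ValueError for anything above 0x10FFFF. The sketch below is illustrative only. The helper name is_valid_code_point is hypothetical, and the remark that the reported value resembles a raw memory address is an assumption, not something the error message confirms.

```python
MAX_CODE_POINT = 0x10FFFF   # highest valid Unicode code point
reported = 0x5555EAA0       # value from the ValueError in the question


def is_valid_code_point(value):
    """Return True if chr() accepts the value as a code point."""
    try:
        chr(value)
    except (ValueError, OverflowError):
        return False
    return True


print(is_valid_code_point(MAX_CODE_POINT))      # upper bound is accepted
print(is_valid_code_point(MAX_CODE_POINT + 1))  # one past the bound is rejected
print(is_valid_code_point(reported))            # far outside the range
```

A value this far outside the range suggests (as an assumption) that something other than character data is being read as a string.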
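Your string is probably being encoded or decoded incorrectly somewhere between C++ and Python, so you can try an explicit encoding, as in .decode('cp1252'). Or you can use .decode('utf-8', 'backslashreplace') and the bad bytes will show up as \xNN escapes in the resulting string (note that 'surrogatepass' only lets surrogate code points through and still raises on other invalid bytes). Change the handler to 'replace' and they become replacement characters (U+FFFD, usually rendered as a question mark), and change it to 'ignore' and they disappear.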
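The decode error handlers can be compared on a small byte string. This is a minimal sketch. The sample bytes are made up for illustration, and the backslashreplace handler is included here as an extra point of comparison.

```python
# b"\xff" is never valid in UTF-8, so strict decoding would fail on it.
raw = b"Hello \xff World"

replaced = raw.decode("utf-8", "replace")          # bad byte becomes U+FFFD
ignored = raw.decode("utf-8", "ignore")            # bad byte is dropped
escaped = raw.decode("utf-8", "backslashreplace")  # bad byte becomes a \xff escape

# surrogatepass only lets encoded surrogates through; other invalid
# bytes still raise UnicodeDecodeError.
try:
    raw.decode("utf-8", "surrogatepass")
    surrogatepass_raised = False
except UnicodeDecodeError:
    surrogatepass_raised = True

print(repr(replaced))
print(repr(ignored))
print(repr(escaped))
print(surrogatepass_raised)
```

These handlers only help when the data really is a byte string with a few bad bytes. They do not repair a value that was never text in the first place.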