Python binding C++ virtual member function cannot be called

I recently wrote a Python 3 extension in C++, but I ran into trouble when calling the C++ code from Python. I don't plan to use a third-party library.

A virtual member function of my bound C++ class cannot be called from Python, but if I remove the virtual keyword, everything works.

It crashes when execution reaches return PyObject_CallObject(pFunction, args); , and I haven't found the reason.

Here is my code:

class A 
{
    PyObject_HEAD
public:
    A()
    {
        std::cout << "A::A()" << std::endl;
    }

    ~A()
    {
        std::cout << "A::~A()" << std::endl;
    }

    virtual void test()
    {
        std::cout << "A::test()" << std::endl;
    }
};

class B : public A
{
public:
    B()
    {
        std::cout << "B::B()" << std::endl;
    }

    ~B()
    {
        std::cout << "B::~B()" << std::endl;
    }

    static PyObject *py(B *self) {
        self->test();
        return PyLong_FromLong((long)123456);
    }
};

static void B_dealloc(B *self) 
{
    self->~B();
    Py_TYPE(self)->tp_free((PyObject *)self);
}

static PyObject *B_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    B *self = (B*)type->tp_alloc(type, 0);
    new (self)B;
    return (PyObject*)self;
}

static PyMethodDef B_methods[] = {
    {"test", (PyCFunction)(B::py), METH_NOARGS, nullptr},
    {nullptr}
};

static struct PyModuleDef example_definition = {
    PyModuleDef_HEAD_INIT,
    "example",
    "example",
    -1,
    B_methods
};

static PyTypeObject ClassyType = {
    PyVarObject_HEAD_INIT(NULL, 0) "example.B", /* tp_name */
    sizeof(B),                                  /* tp_basicsize */
    0,                                          /* tp_itemsize */
    (destructor)B_dealloc,                      /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    0,                                          /* tp_repr */
    0,                                          /* tp_as_number */
    0,                                          /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash  */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    0,                                          /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,   /* tp_flags */
    "B objects",                                /* tp_doc */
    0,                                          /* tp_traverse */
    0,                                          /* tp_clear */
    0,                                          /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    0,                                          /* tp_iter */
    0,                                          /* tp_iternext */
    B_methods,                                  /* tp_methods */
    nullptr,                                    /* tp_members */
    0,                                          /* tp_getset */
    0,                                          /* tp_base */
    0,                                          /* tp_dict */
    0,                                          /* tp_descr_get */
    0,                                          /* tp_descr_set */
    0,                                          /* tp_dictoffset */
    nullptr,                                    /* tp_init */
    0,                                          /* tp_alloc */
    B_new,                                      /* tp_new */
};

PyMODINIT_FUNC PyInit_example(void)
{

    PyObject *m = PyModule_Create(&example_definition);

    if (PyType_Ready(&ClassyType) < 0)
        return NULL;

    Py_INCREF(&ClassyType);
    PyModule_AddObject(m, "B", (PyObject*)&ClassyType);

    return m;
}

PyObject* importModule(std::string name)
{
    PyObject* pModule = PyImport_ImportModule(name.c_str());    // module name
    if (pModule == nullptr)
    {
        std::cout << "load module error!" << std::endl;
        return nullptr;
    }

    return pModule;
}

PyObject* callFunction(PyObject* pModule, std::string name, PyObject* args = nullptr)
{
    PyObject* pFunction = PyObject_GetAttrString(pModule, name.c_str());    // function name
    if (pFunction == nullptr)
    {
        std::cout << "call function error!" << std::endl;
        return nullptr;
    }

    return PyObject_CallObject(pFunction, args);
}

int main()
{
    // add module
    PyImport_AppendInittab("example", PyInit_example);

    // init python
    Py_Initialize();
    {
        PyRun_SimpleString("import sys");
        PyRun_SimpleString("import os");
        PyRun_SimpleString("sys.path.append(os.getcwd() + '\\script')");    // add script path
    }

    // import module
    PyImport_ImportModule("example");

    PyObject* pModule = importModule("Test");
    if (pModule != nullptr)
    {
        PyObject* pReturn = callFunction(pModule, "main");
    }

    PyErr_Print();

    Py_Finalize();

    system("pause");
    return 0;
}

I assume the OP is using the CPython API. (We use CPython, and parts of the code look very familiar.)

As the name already says, it's written in C.

So, when using it to write a Python binding for C++ classes, the developer must be aware that CPython and its C API don't “know” anything about C++. This must be considered carefully (much like writing a C binding for a C++ class library).

When I write Python wrapper classes, I always use structs (to remind myself of this fact). It is possible to use C++ inheritance in the CPython wrappers to mirror the inheritance of the wrapped C++ classes (but that's the only exception to my rule above).

struct and class are nearly the same thing in C++, with the only exception that everything is public by default in a struct but private in a class. SO: Class vs Struct for data only? By the way, CPython accesses its own member variables (e.g. ob_base) through C pointer casts (reinterpret casts) and will not even notice the private protection.

IMHO, it's worth mentioning the term POD (plain old data, also called passive data structure), because this is what makes the C++ wrapper classes compatible with C. SO: What are Aggregates and PODs and how/why are they special? gives a comprehensive overview of this.
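The C-compatibility requirement can be checked at compile time. The following is only a minimal sketch, assuming a simplified stand-in for the CPython object header: FakePyObject, PlainWrapper, VirtualWrapper, and isSafePyLayout are hypothetical names chosen for illustration, not part of CPython. Note that std::is_standard_layout is stricter than strictly needed here: a derived wrapper that adds its own data members is not standard-layout either, although it typically works on common compilers.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for CPython's object header (illustration only).
struct FakePyObject {
  std::ptrdiff_t ob_refcnt;
  void *ob_type;
};

// C-compatible: header is the first member, no virtual functions.
struct PlainWrapper {
  FakePyObject ob_base;
  int myExt;
};

// Not C-compatible: a virtual function makes the type non-standard-layout,
// and compilers usually place a hidden table pointer in front of ob_base.
struct VirtualWrapper {
  FakePyObject ob_base;
  int myExt;
  virtual void test() {}
};

// Hypothetical helper a wrapper author could apply to each extension type.
template <typename T>
constexpr bool isSafePyLayout() {
  return std::is_standard_layout<T>::value;
}

// The build fails here as soon as someone adds a virtual member to PlainWrapper.
static_assert(isSafePyLayout<PlainWrapper>(), "wrapper must be standard-layout");
static_assert(offsetof(PlainWrapper, ob_base) == 0, "header must come first");
static_assert(!isSafePyLayout<VirtualWrapper>(), "virtual breaks the layout");
```

Placing such a static_assert next to each wrapper type turns the runtime crash into a compile error.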

Introducing even one virtual member function in a CPython wrapper class has fatal consequences. Reading the above link carefully makes this clear. However, I decided to illustrate it with a little sample code:

#include <cstddef>
#include <iomanip>
#include <iostream>

// a little experimentation framework:

struct _typeobject { }; // replacement (to keep it simple)
typedef size_t Py_ssize_t; // replacement (to keep it simple)

// copied from object.h of CPython:
/* Define pointers to support a doubly-linked list of all live heap objects. */
#define _PyObject_HEAD_EXTRA            \
    struct _object *_ob_next;           \
    struct _object *_ob_prev;

// copied from object.h of CPython:
/* Nothing is actually declared to be a PyObject, but every pointer to
 * a Python object can be cast to a PyObject*.  This is inheritance built
 * by hand.  Similarly every pointer to a variable-size Python object can,
 * in addition, be cast to PyVarObject*.
 */
typedef struct _object {
  _PyObject_HEAD_EXTRA
  Py_ssize_t ob_refcnt;
  struct _typeobject *ob_type;
} PyObject;

/* PyObject_HEAD defines the initial segment of every PyObject. */
#define PyObject_HEAD                   PyObject ob_base;

void dump(std::ostream &out, const char *p, size_t size)
{
  const size_t n = 16;
  for (size_t i = 0; i < size; ++p) {
    if (i % n == 0) {
      out << std::hex << std::setw(2 * sizeof p) << std::setfill('0')
        << (size_t)p << ": ";
    }
    out << ' '
      << std::hex << std::setw(2) << std::setfill('0')
      << (unsigned)*(unsigned char*)p;
    if (++i % n == 0) out << '\n';
  }
  if (size % n != 0) out << '\n';
}

// the experiment:

static PyObject pyObj;

// This is correct:
struct Wrapper1 {
  PyObject_HEAD
  int myExt;
};
static Wrapper1 wrap1;

// This is possible:
struct Wrapper1Derived: Wrapper1 {
  double myExtD;
};
static Wrapper1Derived wrap1D;

// This is effectively not different from struct Wrapper1
// but things are private in Wrapper2
// ...and Python will just ignore this (using C pointer casts).
class Wrapper2 {
  PyObject_HEAD
  int myExt;
};
static Wrapper2 wrap2;

// This is FATAL - introduces a virtual method table.
class Wrapper3 {
  private:
    PyObject_HEAD
    int myExt;
  public:
    Wrapper3(int value): myExt(value) { }
    virtual ~Wrapper3() { myExt = 0; }
};
static Wrapper3 wrap3{123};

int main()
{
  std::cout << "Dump of PyObject pyObj:\n";
  dump(std::cout, (const char*)&pyObj, sizeof pyObj);
  std::cout << "Dump of Wrapper1 wrap1:\n";
  dump(std::cout, (const char*)&wrap1, sizeof wrap1);
  std::cout << "Dump of Wrapper1Derived wrap1D:\n";
  dump(std::cout, (const char*)&wrap1D, sizeof wrap1D);
  std::cout << "Dump of Wrapper2 wrap2:\n";
  dump(std::cout, (const char*)&wrap2, sizeof wrap2);
  std::cout << "Dump of Wrapper3 wrap3:\n";
  dump(std::cout, (const char*)&wrap3, sizeof wrap3);
  return 0;
}

Compiled and ran:

Dump of PyObject pyObj:
0000000000601640:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000601650:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Dump of Wrapper1 wrap1:
0000000000601600:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000601610:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000601620:  00 00 00 00 00 00 00 00
Dump of Wrapper1Derived wrap1D:
00000000006015c0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000000006015d0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000000006015e0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Dump of Wrapper2 wrap2:
0000000000601580:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000601590:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000000006015a0:  00 00 00 00 00 00 00 00
Dump of Wrapper3 wrap3:
0000000000601540:  d8 0e 40 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000601550:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000601560:  00 00 00 00 00 00 00 00 7b 00 00 00 00 00 00 00

Live Demo on coliru

The dumps of pyObj, wrap1, wrap1D, and wrap2 consist of 00s only, which is no surprise because I made them static. wrap3 looks a bit different, partly because of the constructor (7b == 123) and partly because the C++ compiler put a VMT pointer into the class instance, to which d8 0e 40 very probably belongs. (I assume that a VMT pointer has the size of any function pointer, but I don't really know how the compiler organizes things internally.)

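Imagine what happens when CPython takes the address of wrap3, casts it to PyObject*, and writes the _ob_next pointer, which has offset 0 and is used to chain Python objects into a doubly-linked list. (In a regular build without Py_TRACE_REFS, _PyObject_HEAD_EXTRA is empty and ob_refcnt sits at offset 0 instead, but the effect is the same.) The VMT pointer gets overwritten, which leads to a crash if you are lucky, or to something even worse.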

Imagine, in turn, what happens in the OP's creation function

static PyObject *B_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    B *self = (B*)type->tp_alloc(type, 0);
    new (self)B;
    return (PyObject*)self;
}

when the constructor of B, invoked via placement new, overwrites the initialization of the PyObject internals that probably happened in tp_alloc().
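One common way around this is to keep the Python-facing struct free of virtual functions and let it hold only a pointer to the polymorphic C++ object. The following is a minimal sketch of that layout, not the OP's final code. FakePyObject stands in for the real PyObject so that the example compiles without Python.h, and PyB, PyB_new, PyB_dealloc, and PyB_test are hypothetical names. In a real extension, the storage would come from type->tp_alloc() and be released with tp_free().

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for CPython's object header (illustration only).
struct FakePyObject {
  std::ptrdiff_t ob_refcnt;
  void *ob_type;
};

// The wrapped C++ classes may use virtual functions freely.
class A {
public:
  virtual ~A() {}
  virtual std::string test() { return "A::test()"; }
};

class B : public A {
public:
  std::string test() override { return "B::test()"; }
};

// Python-facing wrapper: plain struct, header first, no virtual members.
struct PyB {
  FakePyObject ob_base;  // must stay at offset 0
  A *impl;               // the polymorphic object lives on the C++ heap
};
static_assert(offsetof(PyB, ob_base) == 0, "header must come first");

// Plays the role of tp_new: calloc stands in for tp_alloc (assumption).
PyB *PyB_new() {
  PyB *self = static_cast<PyB *>(std::calloc(1, sizeof(PyB)));
  if (self == nullptr) return nullptr;
  self->ob_base.ob_refcnt = 1;  // tp_alloc would initialize the header
  self->impl = new B;           // the header is not touched by B's constructor
  return self;
}

// Plays the role of tp_dealloc: destroy the C++ object, then free the storage.
void PyB_dealloc(PyB *self) {
  delete self->impl;
  std::free(self);  // stand-in for tp_free
}

// Plays the role of the bound method: virtual dispatch goes through impl.
std::string PyB_test(PyB *self) {
  return self->impl->test();
}
```

With this layout, the placement new on the wrapper is no longer needed, so nothing written by the allocator gets overwritten, and the virtual call goes through a pointer that CPython never touches.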
