
Cython: Memory ownership issue between Python/C++ when storing pointers

I'm having a memory ownership issue between Python and C++ code using Cython. I'm not sure what the best way to address it is, so I'd appreciate any help.

My strategy is as follows:

  1. Create a C++ NetInfo data structure with get/set members.
  2. Populate this data structure in Python (the data is parsed from files and looked up in an SQLite DB). Thus, I need to cythonize NetInfo.
  3. Pass the NetInfo objects into C++ 'processing' routines to operate on them. These will also be cythonized.

For a NetInfo object, I need to store pointers to other NetInfo objects to indicate interaction between two objects.

My (relevant) C++ code is as follows:

struct NetInfoData;
class NetInfo {
public:
    NetInfo();
    NetInfo(NetInfo const& rhs);
    virtual ~NetInfo();
    ...
    std::vector<NetInfo*> getBridgedNets() const;
    void addBridgedNet(NetInfo* ni);  // non-const pointer, matching the definition below
protected:
    NetInfoData* data_;
};
struct NetInfoData
{
     std::string name;
     ...
     std::vector<NetInfo*>  bridged_nets;  // NOTE: Storing raw pointers.
};
NetInfo::NetInfo()
    : data_(0)
{
    std::cout << "Constructor " << this << std::endl;
    data_ = new NetInfoData();
}
NetInfo::~NetInfo()
{
    std::cout << "Destructor " << this << std::endl;
    delete data_;
}
NetInfo::NetInfo(NetInfo const& rhs)
    : data_(0)
{
    std::cout << "Copy constructor " << this << std::endl;
    data_ = new NetInfoData();
    data_->name = rhs.data_->name;
    ...
    data_->bridged_nets = rhs.data_->bridged_nets;
}
std::vector<NetInfo*>
NetInfo::getBridgedNets() const
{
    return data_->bridged_nets;
}

void
NetInfo::addBridgedNet(NetInfo* n)
{
    data_->bridged_nets.push_back(n);
}

My (relevant) Cython code is as follows. It compiles/works OK.

from cython.operator cimport dereference as deref
from libcpp.vector cimport vector

cdef extern from 'NetInfo.h':
    cdef cppclass NetInfo:
        NetInfo() except +
        NetInfo(NetInfo&) except +
        ...
        vector[NetInfo*] getBridgedNets()
        void             addBridgedNet(NetInfo*)

cdef class PyNetInfo:
    cdef NetInfo* thisptr

    def __cinit__(self, PyNetInfo ni=None):
        if ni is not None:
            self.thisptr = new NetInfo(deref(ni.thisptr))
        else:
            self.thisptr = new NetInfo()
    def __dealloc__(self):
        del self.thisptr
    ...
    def get_bridged_nets(self):
        cdef PyNetInfo r
        cdef NetInfo* n
        cdef vector[NetInfo*] nets = self.thisptr.getBridgedNets()

        result = []
        for n in nets:
            r = PyNetInfo.__new__(PyNetInfo)
            r.thisptr = n
            result.append(r)
        return result

    def add_bridged_net(self, PyNetInfo ni):
        self.thisptr.addBridgedNet(ni.thisptr)

Now my Python pseudo-code is as follows:

from PyNetInfo import PyNetInfo as NetInfo

a = NetInfo()               # Create a
# ... update data members of a (populate a)

tmp = NetInfo(a)     # Call copy constructor of a
for n in range(5):   # a interacts with five other NetInfo objects: create each one and add it to a via add_bridged_net()
   x = NetInfo(tmp)  # Call copy constructor to make a copy of tmp (not a!)
   # ... update data members of x

   a.add_bridged_net(x)   # Store pointer to x in a (a is updated!)

The offending line is x = NetInfo(tmp). On the second iteration, x is rebound to a new object, so the object it previously referred to is released. That leaves a holding an invalid pointer.
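The rebinding behaviour can be reproduced without any C++ at all. The following is a minimal sketch, not the original wrapper: FakeNetInfo and log are hypothetical names, and __del__ stands in for the __dealloc__ that would run del self.thisptr. It assumes CPython's reference counting, where an object is finalized as soon as its last reference goes away.

```python
import gc

log = []  # records lifecycle events so their order can be inspected


class FakeNetInfo:
    """Pure-Python stand-in for the PyNetInfo wrapper (hypothetical)."""

    def __init__(self, name):
        self.name = name
        log.append("construct " + name)

    def __del__(self):
        # In the real wrapper, __dealloc__ would delete the C++ object here.
        log.append("destroy " + self.name)


x = FakeNetInfo("x0")  # first loop iteration
x = FakeNetInfo("x1")  # second iteration: rebinding drops the last reference to x0
gc.collect()           # not required on CPython; only matters on other interpreters

print(log)
```

On CPython the log shows the second object being constructed and the first one destroyed immediately afterwards, which is the same order as the "Copy constructor" / "Destructor" lines in the sample run.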

Sample run :

create a
Constructor 0x101ecd0

create tmp
Copy constructor 0xd71d30

create bridge x
Copy constructor 0xd71bb0
add bridged net:  

create bridge x
Copy constructor 0xc9f740
Destructor 0xd71bb0   <--- Destructor on old x is called due to reassignment which causes a to contain an invalid pointer (hence, eventually segfault)
add bridged net:

I'm not totally sure how to manage the memory to fix this. Can anyone help?

I'm thinking maybe using shared pointers? So in my C++ code, I say

typedef std::shared_ptr<NetInfo> NetInfoShPtr;

Then,

std::vector<NetInfo*> bridged_nets -> std::vector<NetInfoShPtr> bridged_nets;

But then I'm not sure what to do on the Cython side of things. Will this work, or is there some other (easier?) way? Thanks for any ideas.

I was able to solve this issue using shared pointers, which do all the dirty work of managing the memory. The only hassle is that I now need to write deref(self.thisptr) everywhere in Cython to call the C++ get/set methods :).

C++ change:

class NetInfo;  // forward declaration so the typedef below compiles
typedef std::shared_ptr<NetInfo> NetInfoShPtr;

class NetInfo {
public:
    NetInfo();
    NetInfo(NetInfo const& rhs);
    virtual ~NetInfo();
    ...
    std::vector<NetInfoShPtr> getBridgedNets() const;
    void addBridgedNet(NetInfoShPtr const& ni);  
protected:
      NetInfoData* data_;
};

Cython change:

from cython.operator cimport dereference as deref
from libcpp.vector cimport vector
from libcpp.memory cimport shared_ptr

cdef extern from 'NetInfo.h':
    ctypedef shared_ptr[NetInfo] NetInfoShPtr

    cdef cppclass NetInfo:
        NetInfo() except +
        NetInfo(NetInfo&) except +
        ...
        vector[NetInfoShPtr] getBridgedNets()
        void                 addBridgedNet(NetInfoShPtr&)

cdef class PyNetInfo:
    cdef NetInfoShPtr thisptr

    def __cinit__(self, PyNetInfo ni=None):
        if ni is not None:
            self.thisptr = NetInfoShPtr(new NetInfo(deref(ni.thisptr)))
        else:
            self.thisptr = NetInfoShPtr(new NetInfo())
    def __dealloc__(self):
        self.thisptr.reset()   # no del, reset the shared pointer
    ...
    def get_bridged_nets(self):
        cdef PyNetInfo r
        cdef NetInfoShPtr n
        cdef vector[NetInfoShPtr] nets = deref(self.thisptr).getBridgedNets()   # Must dereference

        result = []
        for n in nets:
            r = PyNetInfo.__new__(PyNetInfo)
            r.thisptr = n
            result.append(r)
        return result

    def add_bridged_net(self, PyNetInfo ni):
        deref(self.thisptr).addBridgedNet(ni.thisptr)  # Must dereference

When you do

a.add_bridged_net(x)

a reference to the Python object x is not stored; only a pointer to its C++ NetInfo instance is added to the vector. Once nothing references the Python object x any more, it is deallocated, and its __dealloc__ deletes the corresponding C++ NetInfo instance. The vector is then left holding a pointer to a deallocated object.
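One way to act on this without shared pointers is to have the wrapper keep a Python reference to every net it bridges, so those wrappers (and the C++ objects they own) live at least as long as a does. Below is a minimal pure-Python sketch of that idea. FakeNetInfo and _bridged_refs are hypothetical names; in the real Cython class the list would be an extra attribute alongside thisptr, and add_bridged_net would still call addBridgedNet on the C++ side.

```python
import weakref


class FakeNetInfo:
    """Pure-Python stand-in for PyNetInfo (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self._bridged_refs = []  # strong references keep bridged wrappers alive

    def add_bridged_net(self, other):
        # The real method would also pass other.thisptr to the C++ object.
        self._bridged_refs.append(other)

    def get_bridged_nets(self):
        # Hand back the stored wrappers instead of building new ones.
        return list(self._bridged_refs)


a = FakeNetInfo("a")
watchers = []
for i in range(3):
    x = FakeNetInfo("x%d" % i)       # rebinding x no longer frees the old object
    watchers.append(weakref.ref(x))  # observe liveness without owning x
    a.add_bridged_net(x)

alive = [w() is not None for w in watchers]
names = [n.name for n in a.get_bridged_nets()]
print(alive, names)
```

This only protects pointers that were added through the Python method on that particular wrapper. The C++ copy constructor shown in the question copies bridged_nets, so a copy of a would hold the same raw pointers without the Python-side references, which is one reason the shared-pointer approach may be the more robust of the two.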
