
Read-write thread-safe smart pointer in C++, x86-64

I am developing a lock-free data structure and the following problem has come up.

I have a writer thread that creates objects on the heap and wraps them in a smart pointer with a reference counter. I also have many reader threads that work with these objects. The code looks roughly like this:

SmartPtr ptr;

class Reader : public Thread {
    virtual void Run() {
       for (;;) {
           SmartPtr local(ptr);
           // do smth
       }
    }
};

class Writer : public Thread {
    virtual void Run() {
       for (;;) {
           SmartPtr newPtr(new Object);
           ptr = newPtr;
       }
    }
};

int main() {
    Pool* pool = SystemThreadPool();
    pool->Run(new Reader());
    pool->Run(new Writer());
    for (;;) {} // wait for crash :(
}

When I create a thread-local copy of ptr, that involves at least two steps:

  1. Read an address.
  2. Increment reference counter.

I can't perform these two operations atomically, so my readers sometimes work with an already deleted object.

The question is: what kind of smart pointer should I use to allow read-write access from several threads with correct memory management? A solution should exist, since Java programmers don't even have to care about this problem. They simply rely on the fact that all objects are references and are deleted only when nobody uses them.

For PowerPC I found http://drdobbs.com/184401888 , which looks nice, but it uses Load-Linked and Store-Conditional instructions, which x86 does not have.

As far as I understand, boost pointers provide such functionality only by using locks. I need a lock-free solution.

boost::shared_ptr has atomic_load and atomic_store, which are implemented with a spinlock. That is not lock-free in the strict sense, but it should be fast enough for 99% of possible cases.

boost::shared_ptr<Object> ptr;

class Reader : public Thread {
    virtual void Run() {
       for (;;) {
           boost::shared_ptr<Object> local(boost::atomic_load(&ptr));
           // do smth
       }
    }
};

class Writer : public Thread {
    virtual void Run() {
       for (;;) {
           boost::shared_ptr<Object> newPtr(new Object);
           boost::atomic_store(&ptr, newPtr);
       }
    }
};

int main() {
    Pool* pool = SystemThreadPool();
    pool->Run(new Reader());
    pool->Run(new Writer());
    for (;;) {}
}

EDIT:

In response to comment below, the implementation is in "boost/shared_ptr.hpp"...

template<class T> void atomic_store( shared_ptr<T> * p, shared_ptr<T> r )
{
    boost::detail::spinlock_pool<2>::scoped_lock lock( p );
    p->swap( r );
}

template<class T> shared_ptr<T> atomic_exchange( shared_ptr<T> * p, shared_ptr<T> r )
{
    boost::detail::spinlock & sp = boost::detail::spinlock_pool<2>::spinlock_for( p );

    sp.lock();
    p->swap( r );
    sp.unlock();

    return r; // return std::move( r )
}
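For comparison, C++11 offers free functions std::atomic_load and std::atomic_store for std::shared_ptr with the same shape as the boost ones above (they are deprecated since C++20 in favour of std::atomic<std::shared_ptr<T>>, and typical implementations also take an internal lock). The following is only a minimal sketch of the reader/writer pattern from the question; Object, run_demo and the iteration count are hypothetical names chosen for the illustration.

```cpp
#include <atomic>
#include <memory>
#include <thread>

struct Object {
    explicit Object(int v) : value(v) {}
    int value;
};

// Hypothetical demo: one writer publishes Objects numbered 1..iterations,
// one reader keeps copying the shared pointer and inspects the result.
// Returns true if the reader never saw the values go backwards and the
// last published value is `iterations`.
bool run_demo(int iterations) {
    std::shared_ptr<Object> shared = std::make_shared<Object>(0);
    std::atomic<bool> done(false);
    bool monotonic = true;  // written only by the reader, read after join()

    std::thread reader([&] {
        int last = 0;
        while (!done.load()) {
            // Copies the pointer and bumps the reference count as one
            // atomic step with respect to atomic_store on the same object.
            std::shared_ptr<Object> local = std::atomic_load(&shared);
            if (local->value < last) monotonic = false;
            last = local->value;
        }
    });

    std::thread writer([&] {
        for (int i = 1; i <= iterations; ++i) {
            // The previous Object is released once no reader holds it.
            std::atomic_store(&shared, std::make_shared<Object>(i));
        }
        done.store(true);
    });

    writer.join();
    reader.join();
    return monotonic && std::atomic_load(&shared)->value == iterations;
}
```

Calling run_demo(10000) from main exercises both threads. Every access to the shared pointer goes through the atomic free functions; mixing them with plain reads or writes of the same shared_ptr would reintroduce the race.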

With some jiggery-pokery you should be able to accomplish this using InterlockedCompareExchange128. Store the reference count and pointer in a 2 element __int64 array. If reference count is in array[0] and pointer in array[1] the atomic update would look like this:

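// array must be 16-byte aligned: array[0] = reference count, array[1] = pointer
while(true)
{
    __int64 comparand[2];
    comparand[0] = array[0];    // expected reference count
    comparand[1] = array[1];    // expected pointer
    if(1 == InterlockedCompareExchange128(
        array,
        comparand[1],           // ExchangeHigh: pointer stays the same
        comparand[0] + 1,       // ExchangeLow: reference count + 1
        comparand))
    {
        // Pointer is ready for use. Exit the while loop.
        break;
    }
}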

If an InterlockedCompareExchange128 intrinsic function isn't available for your compiler then you may use the underlying CMPXCHG16B instruction instead, if you don't mind mucking around in assembly language.
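The same retry-loop shape can be tried out portably with std::atomic on a single 64-bit word. This is only an illustration of the compare-and-swap pattern, not the 128-bit version described above: it assumes a hypothetical layout in which a 32-bit slot index stands in for the pointer and a 32-bit counter sits next to it, and the helper names pack, index_of, count_of and acquire are made up for the sketch. Counter overflow is ignored.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical packed word: high 32 bits = slot index (stands in for the
// pointer), low 32 bits = reference count.
inline std::uint64_t pack(std::uint32_t index, std::uint32_t count) {
    return (static_cast<std::uint64_t>(index) << 32) | count;
}
inline std::uint32_t index_of(std::uint64_t w) {
    return static_cast<std::uint32_t>(w >> 32);
}
inline std::uint32_t count_of(std::uint64_t w) {
    return static_cast<std::uint32_t>(w);
}

// Atomically increments the count and returns the index that was stored
// in the slot at the moment the increment succeeded.
std::uint32_t acquire(std::atomic<std::uint64_t>& slot) {
    std::uint64_t expected = slot.load();
    for (;;) {
        std::uint64_t desired =
            pack(index_of(expected), count_of(expected) + 1);
        // On failure compare_exchange_weak stores the current value into
        // `expected`, so the next iteration retries with fresh data.
        if (slot.compare_exchange_weak(expected, desired)) {
            return index_of(expected);
        }
    }
}
```

Because the index and the count live in one word, a writer that replaces the index between the read and the update makes the compare-and-swap fail, and the reader simply retries.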

The solution proposed by RobH doesn't work. It has the same problem as the original question: when accessing the reference count object, it might already have been deleted.

The only way I see to solve the problem without a global lock (as in boost::atomic_store) or conditional read/write instructions is to somehow delay the destruction of the object (or of the shared reference count object, if such a thing is used). So zennehoy has a good idea, but his method is too unsafe.

The way I might do it is by keeping copies of all the pointers in the writer thread so that the writer can control the destruction of the objects:

class Writer : public Thread {
    virtual void Run() {
        list<SmartPtr> ptrs; //list that holds all the old ptr values        

        for (;;) {
            SmartPtr newPtr(new Object);
            if(ptr)
                ptrs.push_back(ptr); //push previous pointer into the list
            ptr = newPtr;

            //Periodically go through the list and destroy objects that are not
            //referenced by other threads
            for(auto it=ptrs.begin(); it!=ptrs.end(); )
                if(it->refCount()==1)
                    it = ptrs.erase(it);
                else
                    ++it;
       }
    }
};

However there are still requirements for the smart pointer class. This doesn't work with shared_ptr as the reads and writes are not atomic. It almost works with boost::intrusive_ptr. The assignment on intrusive_ptr is implemented like this (pseudocode):

//create temporary from rhs
tmp.ptr = rhs.ptr;
if(tmp.ptr)
    intrusive_ptr_add_ref(tmp.ptr);

//swap(tmp,lhs)
T* x = lhs.ptr;
lhs.ptr = tmp.ptr;
tmp.ptr = x;

//destroy temporary
if(tmp.ptr)
    intrusive_ptr_release(tmp.ptr);

As far as I understand, the only thing missing here is a compiler-level memory fence before lhs.ptr = tmp.ptr; . With that added, both reading rhs and writing lhs would be thread-safe under strict conditions:

  1. x86 or x64 architecture.
  2. Atomic reference counting.
  3. The rhs refcount must not go to zero during the assignment (guaranteed by the Writer code above).
  4. Only one thread writes to lhs (using CAS you could have several writers).

Anyway, you could create your own smart pointer class based on intrusive_ptr with necessary changes. Definitely easier than re-implementing shared_ptr. And besides, if you want performance, intrusive is the way to go.

The reason this works much more easily in Java is garbage collection. In C++, you have to ensure manually that a value is not just starting to be used by a different thread at the moment you want to delete it.

A solution I've used in a similar situation is to simply delay the deletion of the value. I create a separate thread that iterates through a list of things to be deleted. When I want to delete something, I add it to this list with a timestamp. The deleting thread waits until some fixed time after this timestamp before actually deleting the value. You just have to make sure that the delay is large enough to guarantee that any temporary use of the value has completed.

100 milliseconds would have been enough in my case, but I chose a few seconds to be safe.
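A minimal sketch of such a delayed-deletion list might look like the following. The class name DeferredDeleter and its methods retire, collect and pending are hypothetical, and the current time is passed in explicitly so the behaviour can be checked without sleeping. The mutex only guards the list itself; readers of the shared pointer never touch it. It assumes retire is called with non-decreasing timestamps.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical helper: queues objects and deletes them only after a fixed
// grace period has passed since they were retired.
class DeferredDeleter {
public:
    using Clock = std::chrono::steady_clock;

    explicit DeferredDeleter(Clock::duration grace) : grace_(grace) {}

    // Anything still pending is deleted when the helper itself goes away.
    ~DeferredDeleter() {
        for (Entry& e : pending_) e.destroy();
    }

    // Schedule `p` for deletion at `now + grace`.
    template <class T>
    void retire(T* p, Clock::time_point now) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(Entry{now + grace_, [p] { delete p; }});
    }

    // Delete every entry whose grace period has expired by `now`.
    // Returns the number of objects deleted by this call.
    std::size_t collect(Clock::time_point now) {
        std::deque<Entry> expired;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            while (!pending_.empty() && pending_.front().due <= now) {
                expired.push_back(std::move(pending_.front()));
                pending_.pop_front();
            }
        }
        for (Entry& e : expired) e.destroy();  // run deleters outside the lock
        return expired.size();
    }

    std::size_t pending() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pending_.size();
    }

private:
    struct Entry {
        Clock::time_point due;
        std::function<void()> destroy;
    };

    Clock::duration grace_;
    mutable std::mutex mutex_;
    std::deque<Entry> pending_;
};
```

In the setup described above, the writer would call retire(oldObject, Clock::now()) after swapping in a new pointer, and the deleting thread would call collect(Clock::now()) periodically. As in the original description, this is only safe if no reader holds on to a raw pointer for longer than the grace period.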
