SEGFAULT when using shared_ptr

Question

I'm trying to implement the lazy concurrent list-based set in C++ using shared_ptr. My reasoning is that unreachable nodes will be freed automatically by the last shared_ptr. As I understand it, increments and decrements of a shared_ptr's reference count are atomic, which means only the last shared_ptr referring to a node should call delete/free for that node. I ran the program with multiple threads, but it crashes with a "double free" error or just a segmentation fault (SIGSEGV). I don't understand how this is possible. My implementation is given below; the method names indicate their intended operations.

#include<thread>
#include<iostream>
#include<mutex>
#include<climits>
#include<memory>   // shared_ptr, make_shared
#include<vector>   // vector
#include<cstdlib>  // rand

using namespace std;

class Thread
{
   public:
      std::thread t;  
};
int n=50,ki=100,kd=100,kc=100; /*no of threads, no of inserts,deletes & searches*/


class Node
{
public:
      int key;
      shared_ptr<Node> next;
      bool marked;
      std::mutex nodeLock;

      Node() {
         key=0;
         next = nullptr;
         marked = false;
      }

      Node(int k) {
         key = k;
         next = nullptr;
         marked = false;
      }

      void lock() {
         nodeLock.lock();
      }

      void unlock() {
         nodeLock.unlock();
      }

      ~Node()
      {
      }
};

class List {
   shared_ptr<Node> head;
   shared_ptr<Node> tail;

public:

   bool validate(shared_ptr<Node> pred, shared_ptr<Node> curr) {
      return !(pred->marked) && !(curr->marked) && ((pred->next) == curr);
   }

   List() {
      head=make_shared<Node>(INT_MIN);
      tail=make_shared<Node>(INT_MAX);
      head->next=tail;
   }

   bool add(int key)
   {
      while(true)
      {
         /*shared_ptr<Node> pred = head;
         shared_ptr<Node> curr = pred->next;*/
        auto pred = head;
        auto curr = pred->next;

         while (key>(curr->key))
         {
            pred = curr;
            curr = curr->next;
         }

         pred->lock();
         curr->lock();

         if (validate(pred,curr))
         {
            if (curr->key == key)
            {
               curr->unlock();
               pred->unlock();
               return false;
            }
            else
            {
                shared_ptr<Node> newNode(new Node(key));
               //auto newNode = make_shared<Node>(key);
                //shared_ptr<Node> newNode = make_shared<Node>(key);
                newNode->next = curr;
                pred->next = newNode;
                curr->unlock();
                pred->unlock();
                return true;
            }
         }
         curr->unlock();
         pred->unlock();
      }
   }

   bool remove(int key)
   {
      while(true)
      {
         /*shared_ptr<Node> pred = head;
         shared_ptr<Node> curr = pred->next;*/

        auto pred = head;
        auto curr = pred->next;

         while (key>(curr->key))
         {
            pred = curr;
            curr = curr->next;
         }

         pred->lock();
         curr->lock();

         if (validate(pred,curr))
         {
            if (curr->key != key)
            {
               curr->unlock();
               pred->unlock();
               return false;
            }
            else
            {
               curr->marked = true;
               pred->next = curr->next;
               curr->unlock();
               pred->unlock();
               return true;
            }
         }
         curr->unlock();
         pred->unlock();
      }
   }

   bool contains(int key) {
      //shared_ptr<Node> curr = head->next;
    auto curr = head->next;

      while (key>(curr->key)) {
         curr = curr->next;
      }
      return curr->key == key && !curr->marked;
   }
}list;

void test(int curr)
{
   bool test;
    int time;

    int val, choice;
    int total,k=0;
    total=ki+kd+kc;

    int i=0,d=0,c=0;

    while(k<total)
    {
        choice = (rand()%3)+1;

        if(choice==1)
        {
            if(i<ki)
            {
                val = (rand()%99)+1;
                test = list.add(val);
                i++;
                k++;
            }
        }
        else if(choice==2)
        {
            if(d<kd)
            {
                val = (rand()%99)+1;
                test = list.remove(val);
                d++;
                k++;
            }
        }
        else if(choice==3)
        {
            if(c<kc)
            {
                val = (rand()%99)+1;
                test = list.contains(val);
                c++;
                k++;
            }
        }
    }
}

int main()
{
   int i;

   vector<Thread>thr(n);

   for(i=0;i<n;i++)
   {
      thr[i].t = thread(test,i+1);
   }
   for(i=0;i<n;i++)
   {
      thr[i].t.join();
   }
   return 0;
}

I can't figure out what's wrong with the above code. The errors differ every time; some are just SEGFAULTs, others are:

pure virtual method called
terminate called without an active exception
Aborted (core dumped)

Could you please point out what I'm doing wrong in the above code, and how to fix it?
EDIT: Added a very crude test function that randomly calls the three list methods. The number of threads and the number of each kind of operation are declared globally. Crude programming, but it reproduces the SEGFAULT.

Answer 1

The issue is that you're not using the atomic store and load operations for your shared_ptrs.

It is true that the reference count in the control block of a shared_ptr is atomic (every shared_ptr participating in ownership of a particular shared object holds a pointer to that control block). However, the data members of the shared_ptr itself are not.

Thus it is safe for multiple threads to each have their own shared_ptr to a shared object, but it is not safe for multiple threads to access the same shared_ptr as soon as at least one of them uses a non-const member function, which is what you're doing when reassigning the next pointer.
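To make the safe half of that rule concrete, here is a minimal sketch. The helper name use_count_after_private_copies is hypothetical, not part of the question's code. Every thread only reads the shared instance (copying is a const operation) and then works on its own copies, so the only shared state that is modified is the atomic reference count.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: each thread makes and destroys its own copies of one
// shared_ptr. Only the atomic count in the control block is modified
// concurrently; the shared instance `object` itself is never written to.
long use_count_after_private_copies(int thread_count, int copies_per_thread) {
    auto object = std::make_shared<int>(42);
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&object, copies_per_thread] {
            // Copying is a const access to `object`, so concurrent copies are fine.
            std::shared_ptr<int> local = object;      // atomic increment
            for (int i = 0; i < copies_per_thread; ++i) {
                std::shared_ptr<int> copy = local;    // atomic increment
                (void)*copy;                          // use the shared object
            }                                         // atomic decrement per copy
        });
    }
    for (auto& th : threads) th.join();
    return object.use_count();  // every thread-local copy is gone by now
}
```

Calling use_count_after_private_copies(4, 1000) should leave exactly one owner. The unsafe case would be a thread assigning to `object` while the others copy from it.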

Illustrating the problem

Let's look at the (simplified and prettified) copy constructor of libstdc++'s shared_ptr implementation:

shared_ptr(const shared_ptr& rhs)
 : m_ptr(rhs.m_ptr),
   m_refcount(rhs.m_refcount) 
{ }

Here m_ptr is just a raw pointer to the shared object, and m_refcount is a class that does the reference counting and also handles the eventual deletion of the object m_ptr points to.

Just one example of what can go wrong: assume that a thread is currently trying to figure out whether a particular key is contained in the list. It starts with the copy-initialization auto curr = head->next in List::contains. Just after it has managed to initialize curr.m_ptr, the OS scheduler decides this thread has to pause and schedules another thread.

That other thread is removing the successor of head. Since the ref-count of head->next is still 1 (after all, thread 1 has not modified it yet), the node is deleted once the second thread is done removing it.

Some time later the first thread continues. It completes the initialization of curr, but since m_ptr was already initialized before thread 2 started the deletion, it still points to the now deleted node. When it tries to compare key > curr->key, thread 1 accesses invalid memory.
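The interleaving above can be replayed deterministically on a single thread with toy stand-ins. This is only a sketch: ToyControl, ToyPtr, toy_release and replay_torn_copy are hypothetical names, they are not the libstdc++ types, and the remover's own local copy of the node is left out for brevity. Nothing is actually freed, so the replay itself has no undefined behaviour.

```cpp
#include <cassert>

// Toy stand-ins (assumptions for illustration, not the real implementation).
struct ToyControl { int refs = 1; bool object_alive = true; };
struct ToyPtr { const char* obj = nullptr; ToyControl* ctl = nullptr; };

// Drops one reference; marks the object as destroyed when the count hits zero.
void toy_release(ToyControl* ctl) {
    if (ctl && --ctl->refs == 0) ctl->object_alive = false;
}

// Replays "reader copies head->next" interleaved with "writer unlinks node A".
// Returns true if the reader ends up pointing at a destroyed node.
bool replay_torn_copy() {
    ToyControl ctl_a, ctl_b;
    ToyPtr head_next{"A", &ctl_a};  // head->next owns node A
    ToyPtr a_next{"B", &ctl_b};     // A->next owns node B

    ToyPtr curr;
    curr.obj = head_next.obj;       // reader, step 1: raw pointer copied, count untouched

    // Writer runs to completion here: head->next = A->next
    ++a_next.ctl->refs;             // new owner of B
    toy_release(head_next.ctl);     // A's count goes 1 -> 0, A is destroyed
    head_next = a_next;

    // Reader, step 2: copies whatever control block head->next has NOW
    curr.ctl = head_next.ctl;
    ++curr.ctl->refs;

    // curr names node A, but holds a reference on B's control block.
    return curr.obj[0] == 'A' && !ctl_a.object_alive && curr.ctl == &ctl_b;
}
```

In the replay the reader's pointer and its control block refer to different nodes, which is the torn state that leads to the invalid access (and, later, to a wrong decrement) described above.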

Using std::atomic_load and std::atomic_store to prevent the issue

std::atomic_load and std::atomic_store prevent the issue by locking a mutex before the call to the copy constructor / copy assignment operator of the shared_ptr that is passed in by pointer. If all reads from and writes to shared_ptrs that are shared across multiple threads go through std::atomic_load and std::atomic_store respectively, it can never happen that one thread has modified only m_ptr but not the reference count at the time another thread starts reading or modifying the same shared_ptr.
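As a small illustration outside the list code, the sketch below has one writer republish a shared_ptr while several readers take snapshots of it, with every access going through std::atomic_load / std::atomic_store. The helper count_bad_reads and its parameters are hypothetical. Note that these free functions are deprecated since C++20 in favour of std::atomic<std::shared_ptr<T>>, so a C++20 compiler may warn about them.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: returns how many reader snapshots were null or held a
// value outside the range the writer ever published.
long count_bad_reads(int reader_count, int writes) {
    std::shared_ptr<int> slot = std::make_shared<int>(0);  // the contended shared_ptr
    std::atomic<long> bad_reads{0};

    std::thread writer([&] {
        for (int i = 1; i <= writes; ++i) {
            // Replaces pointer and control block as one atomic step.
            std::atomic_store(&slot, std::make_shared<int>(i));
        }
    });

    std::vector<std::thread> readers;
    for (int r = 0; r < reader_count; ++r) {
        readers.emplace_back([&] {
            for (int i = 0; i < writes; ++i) {
                // The snapshot is a private copy that keeps its object alive.
                std::shared_ptr<int> snapshot = std::atomic_load(&slot);
                if (!snapshot || *snapshot < 0 || *snapshot > writes) ++bad_reads;
            }
        });
    }

    writer.join();
    for (auto& reader : readers) reader.join();
    return bad_reads.load();
}
```

With plain `slot = ...` and `auto snapshot = slot;` instead, the same program would contain a data race and could crash in the same ways as the list in the question.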

With the necessary atomic loads and stores, the List member functions should read as follows:

bool validate(Node const& pred, Node const& curr) {
   return !(pred.marked) && !(curr.marked) && 
          (std::atomic_load(&pred.next).get() == &curr);
}

bool add(int key) {
    while (true) {
        auto pred = std::atomic_load(&head);
        auto curr = std::atomic_load(&pred->next);

        while (key > (curr->key)) {
            pred = std::move(curr);
            curr = std::atomic_load(&pred->next);
        }

        std::scoped_lock lock{pred->nodeLock, curr->nodeLock};
        if (validate(*pred, *curr)) {
            if (curr->key == key) {
                return false;
            } else {
                auto new_node = std::make_shared<Node>(key);

                new_node->next = std::move(curr);
                std::atomic_store(&pred->next, std::move(new_node));
                return true;
            }
        }
    }
}

bool remove(int key) {
    while (true) {
        auto pred = std::atomic_load(&head);
        auto curr = std::atomic_load(&pred->next);

        while (key > (curr->key)) {
            pred = std::move(curr);
            curr = std::atomic_load(&pred->next);
        }

        std::scoped_lock lock{pred->nodeLock, curr->nodeLock};
        if (validate(*pred, *curr)) {
            if (curr->key != key) {
                return false;
            } else {
                curr->marked = true;
                std::atomic_store(&pred->next, std::atomic_load(&curr->next));
                return true;
            }
        }
    }
}

bool contains(int key) {
    auto curr = std::atomic_load(&head->next);

    while (key > (curr->key)) {
        curr = std::atomic_load(&curr->next);
    }
    return curr->key == key && !curr->marked;
}

Additionally, you should also make Node::marked a std::atomic_bool.
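A possible shape for the adjusted Node is sketched below. It keeps the member names from the question, merges the two constructors into one with a default argument, and drops the lock()/unlock() wrappers because the rewritten functions lock nodeLock directly. The helper mark_and_check is hypothetical and only exercises the flag.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>

struct Node {
    int key;
    std::shared_ptr<Node> next;      // access via std::atomic_load / std::atomic_store
    std::atomic_bool marked{false};  // read by contains() without holding nodeLock
    std::mutex nodeLock;

    explicit Node(int k = 0) : key(k) {}
};

// Hypothetical helper: the flag starts false and can be set atomically.
bool mark_and_check(int key) {
    auto node = std::make_shared<Node>(key);
    bool before = node->marked.load();
    node->marked.store(true);
    return !before && node->marked.load();
}
```

The flag needs to be atomic because contains() reads marked while remove() may be writing it under the node's lock; with a plain bool that unsynchronized read would be a data race.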

Answer 1
5 2018-02-11 01:01:43
