簡體   English   中英

並發雙鏈表-多個生產者/消費者FIFO隊列-死鎖

[英]Concurrent Doubly Linked List - Multiple Producer/Consumer FIFO Queue - Deadlock

我正在處理MPMC FIFO隊列的雙向鏈接列表(主要用於演示)。 我現在的目標是試圖糾正我的錯誤一段時間,但我並沒有真正取得進展。

在所有Producer-Threads產生所有值后不久,我陷入了僵局。 我將問題歸結為以下內容:

鎖獲取:

<THREAD ID> | <MESSAGE> | <MUTEX ADDRESS>

4460175360 : TRYLOCK HEAD:   0x7fb380c039d0
4460175360 : LOCKED HEAD:     0x7fb380c039d0
4460175360 : RELEASED HEAD: 0x7fb380c039d0

4460175360 : TRYLOCK HEAD:    0x7fb380c039d0
4460175360 : LOCKED HEAD:      0x7fb380c039d0
4460175360 : RELEASED HEAD:  0x7fb380c039d0

4460175360 : TRYLOCK HEAD:    0x7fb380c039d0
4460175360 : LOCKED HEAD:      0x7fb380c039d0
4460175360 : RELEASED HEAD:  0x7fb380c039d0

4460175360 : TRYLOCK HEAD:    0x7fb380c039d0
4459638784 : TRYLOCK TAIL:       0x7fb380c039d0
4460711936 : TRYLOCK TAIL:       0x7fb380c039d0
4460175360 : LOCKED HEAD:      0x7fb380c039d0
4460175360 : RELEASED HEAD:  0x7fb380c039d0

4460175360 : TRYLOCK HEAD: 0x7fb380c039d0   <---- THIS TRY-LOCK-HEAD NEVER GETS CONFIRMED!
4459638784 : LOCKED TAIL:  0x7fb380c039d0 
4459638784 : RELEASE MUTEX: 0x7fb380c039d0 <---- THE PRODUCER THREAD RELEASED 0x7fb380c039d0
Producer-Thread 4459638784 produced: 0

---- NOW ONE ELEMENT IN QUEUE ----

4460711936 : LOCKED TAIL:  0x7fb381800020 <---- ADDRESS OF LOCK HAS CHANGED WHICH IS FINE (because an element has been added)
4460711936 : RELEASE MUTEX AT ADDR: 0x7fb381800020
Producer-Thread 4460711936 produced: 0

---- NOW TWO ELEMENTS IN QUEUE ----

4460711936 : TRYLOCK TAIL: 0x7fb381a00020
4459638784 : TRYLOCK TAIL: 0x7fb381a00020
4460711936 : LOCKED TAIL:   0x7fb381a00020
4460711936 : RELEASE MUTEX AT ADDR: 0x7fb381a00020
Producer-Thread 4460711936 produced: 1

---- NOW THREE ELEMENTS IN QUEUE ----

4459638784 : LOCKED TAIL: 0x7fb380d00060  <---- AGAIN ADDRESS HAS CHANGED -- FINE
4459638784 : RELEASE MUTEX AT ADDR: 0x7fb380d00060
Producer-Thread 4459638784 produced: 1

---- NOW FOUR ELEMENTS IN QUEUE ---- PRODUCERS ARE DONE ----
---- CONSUMER THREAD BLOCKS - STILL TRYING TO LOCK 0x7fb380c039d0 ----

---- BUT NOBODY ELSE HOLDS 0x7fb380c039d0 ---- ALSO NO SELF-DEADLOCK?

GDB告訴我,兩個生產者線程都已終止,並且唯一在addr 0x7fb380c039d0等待mutx的線程是線程2(唯一的使用者線程):

    (gdb) info threads
* 2                         0x00007fff8edc5dfd in pthread_mutex_lock ()
  1 "com.apple.main-thread" 0x00007fff8e715386 in __semwait_signal ()
(gdb) bt
#0  0x00007fff8e715122 in __psynch_mutexwait ()
#1  0x00007fff8edc5dfd in pthread_mutex_lock ()
#2  0x00000001000011a8 in ConcurrentDoublyLinkedList<int>::consumeNode (this=0x1001000e0) at ConcurrentDList.h:142
#3  0x0000000100000bd4 in consumeValues (ctx=0x7fff5fbffa78) at ConcurrentDList.cpp:29
#4  0x00007fff8edc07a2 in _pthread_start ()
#5  0x00007fff8edad1e1 in thread_start ()
(gdb) f 2
#2  0x00000001000011a8 in ConcurrentDoublyLinkedList<int>::consumeNode (this=0x1001000e0) at ConcurrentDList.h:142
142             pthread_mutex_lock(&head_->mutex);
(gdb) info reg
rax            0x200012d    33554733
rbx            0x100281000  4297592832
rcx            0x100280da8  4297592232
rdx            0x100    256
rsi            0x403    1027
rdi            0x1001039f0  4296030704
rbp            0x100280ed0  0x100280ed0
rsp            0x100280e00  0x100280e00
r8             0x2060   8288
r9             0x100280720  4297590560
r10            0xf9d79  1023353
r11            0x206    518
r12            0x1303   4867
r13            0x0  0
r14            0x7fff5fbffa78   140734799805048
r15            0x100000bb0  4294970288
rip            0x1000011a8  0x1000011a8 <ConcurrentDoublyLinkedList<int>::consumeNode()+110>
eflags         0x206    518
cs             0x7  7
ss             0x0  0
ds             0x0  0
es             0x0  0
fs             0x0  0
gs             0x0  0
(gdb) p *(pthread_mutex_t*) 0x1001039f0
$34 = {
  __sig = 1297437784, 
  __opaque = "\000\000\000\000` \000\000\000\000\000\000\000\000\000\000\003\004\000\000\000\002\000\000{?\017\000\000\000\000\000\b:\020\000\001\000\000\000\f:\020\000\001\000\000\000\000\000\000\000\000\000\000"
}

不幸的是,我無法捕獲pthread_mutex_t,因為它是不透明的類型。 (盡管我在GDB中啟用了不透明類型?)。 否則,最好查看該互斥對象的所有者是誰。

下面列出了我的列表代碼-對其進行了過多注釋,以解釋我對實現的看法。 請注意,到目前為止,這不是一個好的或高性能的實現,主要是出於說明目的。

CPP代碼:

#include <stdlib.h>
#include <iostream>

#include <pthread.h>

template <class T>
class Node
{
    private:
       T value_;

       Node<T> *next_;
       Node<T> *prev_;

    public:

        pthread_mutex_t mutex;

        Node(T value, Node<T>* next, Node<T>* prev)
        {
            value_ = value;

            next_ = next;
            prev_ = prev;

            pthread_mutex_init(&mutex, NULL);
        }

        ~Node()
        {
            pthread_mutex_destroy(&mutex);
        }

        void setNext(Node<T>* next)
        {
            next_ = next;
        }        

        void setPrevious(Node<T>* prev)
        {
            prev_ = prev;
        }

        Node<T>* getNext()
        {
            return next_;
        } 

        Node<T>* getPrevious()
        {
            return prev_;
        }


        virtual bool isSentinel()
        {
            return false;
        } 
};

template <class T>
class SentinelNode : public Node<T>
{
    public:
      SentinelNode(T val, Node<T>* next, Node<T>* prev)
          : Node<T>(val, next, prev)
      {
      }

      virtual bool isSentinel()
      {
          return true;
      }
};

template <class T>
class ConcurrentDoublyLinkedList
{
    private:
        Node<T> *head_;
        Node<T> *tail_;

    public: 
        ConcurrentDoublyLinkedList()
        {
            head_ = tail_ = new SentinelNode<T>(NULL, NULL, NULL);
        }

        void produceNode(Node<T>* n)
        {  
            pthread_mutex_lock(&tail_->mutex);

            Node<T>* old = tail_;

            if(head_->isSentinel())
            {
                head_ = tail_ = n;
                pthread_mutex_unlock(&old->mutex);

                delete old;
                old = NULL;
            }
            else
            {
                n->setNext(tail_);
                tail_->setPrevious(n);
                tail_ = n;

                pthread_mutex_unlock(&old->mutex);
            }
        }

        Node<T>* consumeNode()
        {
            pthread_mutex_lock(&head_->mutex);

            if(head_->isSentinel())
            {
                 /* the head can only be a Sentinel if there is
                  * no element within the list
                  */

                  /* this also means that the list is currently
                   * implicitly fully locked */

                  /* return NULL and let the thread decide what to do */

                  pthread_mutex_unlock(&head_->mutex);

                  return NULL;
            }
            else
            {
                /* the head is not a Sentinel, which implies on of the following:
                 *     - The list has exactly one element, which means:
                 *         => head_ == tail_ && !head_->isSentinel()
                 *     
                 *     OR
                 *     
                 *     - The list has at least two elements, which means:
                 *         => head_ != tail_ && !head_->isSentinel()
                 *
                 *     - The absence of a Sentinel guarantees that the list
                 *       is NOT empty
                 */

                 if(head_ == tail_)
                 {
                      /* single element within the list 
                       * the list is still fully locked, 
                       * implicit because head_ == tail_
                       */

                       /* we replace the only element
                        * with a Sentinel, because
                        * the list is empty afterwards
                        */

                        Node<T>* p = head_;
                        head_ = tail_ = new SentinelNode<T>(NULL, NULL, NULL);

                        pthread_mutex_unlock(&p->mutex);

                        return p;
                 }
                 else
                 {
                      /* at least two elements are in the list
                       * which means that the current head
                       * must have a valid predecessor 
                       */

                       /* Producer and Consumer could
                        * hold a lock on two adjacent
                        * nodes. Thus, the Consumer
                        * must acquire head->prev
                        */

                        Node<T>* p = head_;
                        Node<T>* n = head_->getPrevious();

                        pthread_mutex_lock(&n->mutex);

                        /* head_->prev can now not be owned 
                         * by a producer. 
                         */

                        head_ = n;
                        head_->setNext(NULL);

                        pthread_mutex_unlock(&p->mutex);
                        pthread_mutex_unlock(&n->mutex);

                        return p;
                 } 
            }
        }
};

非常感謝您對這些東西的想法。 我現在正在看它一段時間,也可能完全走錯了路。 感謝您的幫助。

謝謝塞巴斯蒂安

假設節點之后仍然是頭尾,那么消費和生產都鎖定頭尾節點。 也就是說,線程可能具有不再是頭節點或尾節點的鎖定節點。 如果在鎖定之前存儲了一個“舊”指針,那么您至少會知道鎖定了哪個節點,但是鎖定的節點可能已經被使用,並且某些邏輯可能不再成立。

我認為鎖定頭和尾指針比鎖定實際節點要容易得多,假設您只需要產生和消費方法。 當然,最簡單的解決方案是鎖定整個數據結構。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM