Segfault in std::condition_variable::notify_all() on Linux (C++)

Question

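I'm trying to get up to date with the changes in C++11, and I'm creating a thread-safe wrapper around std::queue called SafeQueue. There are two conditions it may block on: the queue being full and the queue being empty. I'm using std::condition_variable for both. Unfortunately, on Linux, the notify_all() call on my empty condition segfaults. It works fine on a Mac with clang. The segfault occurs in the enqueue() method: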

#ifndef mqueue_hpp
#define mqueue_hpp

#include <queue>
#include <mutex>

//////////////////////////////////////////////////////////////////////
// SafeQueue - A thread-safe templated queue.                       //
//////////////////////////////////////////////////////////////////////
template<class T>
class SafeQueue
{
public:
    // Instantiate a new queue. 0 maxsize means unlimited.
    SafeQueue(unsigned int maxsize = 0);
    ~SafeQueue(void);
    // Enqueue a new T. If enqueue would cause it to exceed maxsize,
    // block until there is room on the queue.
    void enqueue(const T& item);
    // Dequeue a new T and return it. If the queue is empty, wait on it
    // until it is not empty.
    T& dequeue(void);
    // Return size of the queue.
    size_t size(void);
    // Return the maxsize of the queue.
    size_t maxsize(void) const;
private:
    std::mutex m_mutex;
    std::condition_variable m_empty;
    std::condition_variable m_full;
    std::queue<T> m_queue;
    size_t m_maxsize;
};

template<class T>
SafeQueue<T>::SafeQueue(unsigned int maxsize) : m_maxsize(maxsize) { }

template<class T>
SafeQueue<T>::~SafeQueue() { }

template<class T>
void SafeQueue<T>::enqueue(const T& item) {
    // Synchronize.
    if ((m_maxsize != 0) && (size() == m_maxsize)) {
        // Queue full. Can't push more on. Block until there's room.
        std::unique_lock<std::mutex> lock(m_mutex);
        m_full.wait(lock);
    }
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        // Add to m_queue and notify the reader if it's waiting.
        m_queue.push(item);
    }
    m_empty.notify_all();
}

template<class T>
T& SafeQueue<T>::dequeue(void) {
    // Synchronize. No unlock needed due to unique lock.
    if (size() == 0) {
        // Wait until something is put on it.
        std::unique_lock<std::mutex> lock(m_mutex);
        m_empty.wait(lock);
    }
    std::lock_guard<std::mutex> lock(m_mutex);
    // Pull the item off and notify writer if it's waiting on full cond.
    T& item = m_queue.front();
    m_queue.pop();
    m_full.notify_all();
    return item;
}

template<class T>
size_t SafeQueue<T>::size(void) {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_queue.size();
}

template<class T>
size_t SafeQueue<T>::maxsize(void) const {
    return m_maxsize;
}

#endif /* mqueue_hpp */

Clearly I'm doing something wrong, but I can't figure out what. Output from gdb:

Core was generated by `./test'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000000414739 in std::condition_variable::notify_all() ()
#2  0x00000000004054c4 in SafeQueue<int>::enqueue (this=0x7ffee06b3470,
    item=@0x7ffee06b355c: 1) at ../mqueue.hpp:59
#3  0x0000000000404ab6 in testsafequeue () at test.cpp:13
#4  0x0000000000404e99 in main () at test.cpp:49
(gdb) frame 2
#2  0x00000000004054c4 in SafeQueue<int>::enqueue (this=0x7ffee06b3470,
    item=@0x7ffee06b355c: 1) at ../mqueue.hpp:59
59          m_empty.notify_all();
(gdb) info locals
No locals.
(gdb) this.m_empty
Undefined command: "this.m_empty".  Try "help".
(gdb) print this->m_empty
$1 = {_M_cond = {__data = {{__wseq = 0, __wseq32 = {__low = 0,
          __high = 0}}, {__g1_start = 0, __g1_start32 = {__low = 0,
          __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0},
      __g1_orig_size = 0, __wrefs = 0, __g_signals = {0, 0}},
    __size = '\000' <repeats 47 times>, __align = 0}}

Any help is appreciated.

Answer 1

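In both functions you take one lock to wait on the condition variable, but as soon as the wait is over you destroy that lock and take a new one to actually operate on the queue.

Between releasing the first lock and acquiring the new one, another thread may lock the mutex and, for example, remove the object that the original thread intended to take. The original thread may then end up calling front on an empty queue.

You need to take a single lock in each function and perform all operations under it. The lock should be released (automatically) only while wait is executing.

Furthermore, returning from wait does not imply that notify_* was called. wait may wake up spuriously. notify_all may also wake several threads for a single newly available element. You need to call wait in a loop and check the condition required for the operation before leaving the loop.

wait also provides an overload that takes the condition as a predicate in its second argument, which lets you avoid the explicit loop.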
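As a minimal sketch of the predicate overload, assuming a simplified bounded queue (the class name BoundedQueue and its members are hypothetical, not the asker's code): each function takes one std::unique_lock, and wait(lock, pred) re-checks the predicate after every wakeup, which covers both spurious wakeups and the case where another thread got to the element first.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical bounded queue used only to illustrate wait(lock, predicate).
template <class T>
class BoundedQueue {
public:
    // maxsize == 0 means unlimited, as in the question.
    explicit BoundedQueue(std::size_t maxsize = 0) : m_maxsize(maxsize) {}

    void enqueue(const T& item) {
        std::unique_lock<std::mutex> lock(m_mutex);
        // Equivalent to: while (!pred()) m_full.wait(lock);
        m_full.wait(lock, [this] {
            return m_maxsize == 0 || m_queue.size() < m_maxsize;
        });
        m_queue.push(item);
        m_empty.notify_all();
    }

    T dequeue() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_empty.wait(lock, [this] { return !m_queue.empty(); });
        assert(!m_queue.empty());             // guaranteed by the predicate
        T item = std::move(m_queue.front());  // take the value before pop()
        m_queue.pop();
        m_full.notify_all();
        return item;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_queue.size();
    }

private:
    mutable std::mutex m_mutex;        // mutable so size() can be const
    std::condition_variable m_empty;   // signalled when an item is added
    std::condition_variable m_full;    // signalled when an item is removed
    std::queue<T> m_queue;
    std::size_t m_maxsize;
};
```

With this shape, a producer thread can call enqueue and a consumer thread can call dequeue without either of them ever touching m_queue outside the lock.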

Beyond that, the lines

T& item = m_queue.front();
m_queue.pop();
//...
return item;

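also cause undefined behavior. pop destroys the object that item refers to, leaving a dangling reference. Using the returned reference causes undefined behavior.

You need to copy/move the object out of the queue instead of keeping a reference to it: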

T item = m_queue.front();
m_queue.pop();
//...
return item;

Consequently, dequeue must also return T rather than T&.
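The move variant might look like the following sketch. The helper name take_front is hypothetical and not part of std::queue; it leaves locking to the caller and only shows the order of operations.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>

// Hypothetical helper: remove the front element and return it by value.
// Precondition: the queue is not empty (and, in threaded code, the caller
// already holds whatever lock protects the queue).
template <class T>
T take_front(std::queue<T>& q) {
    assert(!q.empty());
    T item = std::move(q.front());  // move out while the element is still alive
    q.pop();                        // destroys the (moved-from) element
    return item;                    // returned by value, so nothing dangles
}
```

For types without a move constructor, std::move falls back to a copy, so the same code works for both cases.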

Answer 2

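First of all, trying to make a collection class thread-safe by wrapping all of its public methods with a mutex lock is a mistake. See my earlier answer on this topic about the race conditions inherent in trying to make a thread-safe container.

As for your current code, these lines look suspicious: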

T& item = m_queue.front();
m_queue.pop();

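Your m_queue.front call returns a reference to the item in the queue, but the pop method destroys that item and invalidates the reference. Any later use of item, such as copying from the reference your function returns, is undefined behavior.

Better: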

T item = m_queue.front(); // make the copy to be returned
m_queue.pop();

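As for the rest of your implementation, you have several race conditions and other problems. You should never assume that the condition you were waiting for holds just because wait returned (that assumption is itself a race condition). Instead, hold the lock for the whole function. When you call wait, it releases the lock while blocked and re-acquires it before returning.

As user17732522 mentioned in the comments, trying to recursively lock a std::mutex (by calling size() inside your enqueue or dequeue methods while the lock is held) will most certainly deadlock. You could use a recursive mutex, but it is better to avoid that pattern.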
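One way to avoid re-locking, sketched here with a hypothetical class LockedQueue: public methods take the lock exactly once, and shared logic lives in private helpers that expect the caller to already hold it. The names try_push and size_unlocked are illustrative assumptions.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>

// Hypothetical wrapper showing the "locked public method, unlocked private
// helper" split.
template <class T>
class LockedQueue {
public:
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return size_unlocked();
    }

    // Returns false instead of blocking when the queue already holds
    // `limit` items.
    bool try_push(const T& item, std::size_t limit) {
        std::lock_guard<std::mutex> lock(m_mutex);
        // Calling size() here would lock m_mutex a second time on the same
        // thread, which is undefined behavior for std::mutex and typically
        // shows up as a deadlock. The unlocked helper avoids that.
        if (size_unlocked() >= limit) {
            return false;
        }
        m_queue.push(item);
        return true;
    }

private:
    // Precondition: the caller holds m_mutex.
    std::size_t size_unlocked() const { return m_queue.size(); }

    mutable std::mutex m_mutex;  // mutable so size() can be const
    std::queue<T> m_queue;
};
```

The improved code in this answer takes the same approach in a simpler form by calling m_queue.size() directly while the lock is held.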

An improved version:

template<class T>
void SafeQueue<T>::enqueue(const T& item) {

    std::unique_lock<std::mutex> lock(m_mutex);

    while ((m_maxsize != 0) && (m_queue.size() >= m_maxsize)) {
        // Queue full. Can't push more on. Block until there's room.
        m_full.wait(lock); // atomically unlocks the mutex and waits for the cv to be notified
    }
    m_queue.push(item);
    m_empty.notify_all();
}

// Note: the declaration in the class must also change from T& to T.
template<class T>
T SafeQueue<T>::dequeue(void) {

    std::unique_lock<std::mutex> lock(m_mutex);

    while (m_queue.size() == 0) {
        // Wait until something is put on it.
        m_empty.wait(lock); // atomically unlocks the mutex and waits for the cv
    }

    // Pull the item off and notify writer if it's waiting on full cond.
    T item = m_queue.front();
    m_queue.pop();
    m_full.notify_all();
    return item;
}

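Again, I stand by my first paragraph: creating a thread-safe container is a design fallacy. Instead, make the scenario that uses the non-thread-safe container thread-safe!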
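One possible reading of this advice, as a sketch only: keep a plain std::queue and let the code that owns the use case hold one lock around the whole compound operation. The names WorkList, add and take_if_available are hypothetical.

```cpp
#include <mutex>
#include <queue>

// Hypothetical application-level type: the lock guards a whole use case
// rather than individual container calls.
class WorkList {
public:
    void add(int job) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_jobs.push(job);
    }

    // The emptiness check and the removal happen under one lock, so no other
    // thread can drain the queue between the two steps.
    bool take_if_available(int& job) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_jobs.empty()) {
            return false;
        }
        job = m_jobs.front();  // copy the value before pop() destroys it
        m_jobs.pop();
        return true;
    }

private:
    std::mutex m_mutex;
    std::queue<int> m_jobs;  // plain, non-thread-safe container
};
```

By contrast, a container that only locks inside separate empty() and pop() methods leaves a window between the two calls in which another thread can remove the last element.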
