
Lock-Free Queue with boost::atomic - Am I doing this right?

Short version:

I'm trying to replace the C++11 std::atomic used in the lock-free, single-producer, single-consumer queue implementation from here. How do I replace it with boost::atomic?

Long version:

I'm trying to get better performance out of our app by using worker threads. Each thread has its own task queue. We currently have to take a lock before dequeuing or enqueuing each task.
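For reference, the locked baseline described above can be sketched roughly as follows (class and method names are mine for illustration; std::mutex is used here for brevity, though a pre-C++11 project would presumably use boost::mutex or pthreads):

```cpp
#include <mutex>
#include <queue>

// Minimal sketch of the per-thread task queue with lock-based synchronization.
// Every Enqueue/Dequeue serializes on the same mutex, which is the
// contention the question is trying to eliminate.
template <typename T>
class LockedQueue
{
public:
    void Enqueue(const T& t)
    {
        std::lock_guard<std::mutex> guard(mutex_); // producer takes the lock
        queue_.push(t);
    }

    bool Dequeue(T& result)
    {
        std::lock_guard<std::mutex> guard(mutex_); // consumer takes the same lock
        if (queue_.empty())
            return false;
        result = queue_.front();
        queue_.pop();
        return true;
    }

private:
    std::queue<T> queue_;
    std::mutex mutex_;
};
```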

Then I found Herb Sutter's article on a lock-free queue. It seems like an ideal replacement. But the code uses std::atomic from C++11, which I can't introduce to the project at this time.

More googling led to some examples, such as this one for Linux (echelon's), and this one for Windows (TINESWARE's). Both use platform-specific constructs such as WinAPI's InterlockedExchangePointer and GCC's __sync_lock_test_and_set.

I only need to support Windows and Linux, so maybe I could get away with some #ifdefs. But I thought it might be nicer to use what boost::atomic provides. Boost.Atomic is not part of the official Boost distribution yet, so I downloaded the source from http://www.chaoticmind.net/~hcb/projects/boost.atomic/ and use the include files in my project.
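The #ifdef route mentioned above might look roughly like this (the wrapper name atomic_exchange_ptr is mine, not from either linked example; note that __sync_lock_test_and_set only guarantees acquire semantics, whereas InterlockedExchangePointer is a full barrier, which is one reason a unified abstraction like boost::atomic is attractive):

```cpp
#include <cstddef>

// One portable entry point hiding the two platform intrinsics that the
// Linux and Windows examples use: atomically swap in a new pointer and
// return the previous value.
#if defined(_WIN32)
#include <windows.h>
inline void* atomic_exchange_ptr(void* volatile* target, void* value)
{
    return InterlockedExchangePointer(target, value); // full memory barrier
}
#else
inline void* atomic_exchange_ptr(void* volatile* target, void* value)
{
    return __sync_lock_test_and_set(target, value); // acquire barrier only
}
#endif
```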

This is what I have so far:

#pragma once

#include <boost/atomic.hpp>

template <typename T>
class LockFreeQueue
{
private:
    struct Node
    {
        Node(T val) : value(val), next(NULL) { }
        T value;
        Node* next;
    };
    Node* first; // for producer only
    boost::atomic<Node*> divider;  // shared
    boost::atomic<Node*> last; // shared

public:
    LockFreeQueue()
    {
        first = new Node(T());
        divider = first;
        last = first;
    }

    ~LockFreeQueue()
    {
        while(first != NULL) // release the list
        {
            Node* tmp = first;
            first = tmp->next;
            delete tmp;
        }
    }

    void Produce(const T& t)
    {
        last.load()->next = new Node(t); // add the new item
        last = last.load()->next;
        while(first != divider) // trim unused nodes
        {
            Node* tmp = first;
            first = first->next;
            delete tmp;
        }
    }

    bool Consume(T& result)
    {
        if(divider != last) // if queue is nonempty
        {
            result = divider.load()->next->value; // C: copy it back
            divider = divider.load()->next;
            return true;  // and report success
        }
        return false;  // else report empty
    }
};

Some modifications to note:

boost::atomic<Node*> divider;  // shared
boost::atomic<Node*> last; // shared

and

    last.load()->next = new Node(t); // add the new item
    last = last.load()->next;

and

        result = divider.load()->next->value; // C: copy it back
        divider = divider.load()->next;

Am I applying the load() (and the implicit store()) from boost::atomic correctly here? Can we say this is equivalent to Sutter's original C++11 lock-free queue?

PS. I studied many of the threads on SO, but none seems to provide an example of boost::atomic with a lock-free queue.

Have you tried Intel Threading Building Blocks' atomic<T>? Cross-platform and free.

Also...

Single producer/single consumer makes your problem much easier because your linearization point can be a single operation. It becomes easier still if you are prepared to accept a bounded queue.

A bounded queue offers cache-performance advantages because you can reserve a cache-aligned memory block to maximize your hits, e.g.:

#include <vector>
#include "tbb/atomic.h"
#include "tbb/cache_aligned_allocator.h"    

template< typename T >
class SingleProducerSingleConsumerBoundedQueue {
    typedef std::vector<T, tbb::cache_aligned_allocator<T> > queue_type;

public:
    SingleProducerSingleConsumerBoundedQueue(size_t capacity) :
        queue(capacity) { // size up front so every slot is valid to index
        head = 0;
        tail = 0;
    }

    size_t capacity() {
        return queue.size();
    }

    bool try_pop(T& result) {
        if(tail - head == 0)
            return false;
        else {
            result = queue[head % queue.size()];
            head.fetch_and_increment(); // linearization point
            return true;
        }
    }

    bool try_push(const T& source) {
        if(tail - head == queue.size())
            return false;
        else {
            queue[tail % queue.size()] = source;
            tail.fetch_and_increment(); // linearization point
            return true;
        }
    }

private:
    queue_type queue;
    tbb::atomic<size_t> head;
    tbb::atomic<size_t> tail;
};

Check out this boost.atomic ringbuffer example from the documentation:

#include <boost/atomic.hpp>

template <typename T, size_t Size>
class ringbuffer
{
public:
    ringbuffer() : head_(0), tail_(0) {}

    bool push(const T & value)
    {
        size_t head = head_.load(boost::memory_order_relaxed);
        size_t next_head = next(head);
        if (next_head == tail_.load(boost::memory_order_acquire))
            return false;
        ring_[head] = value;
        head_.store(next_head, boost::memory_order_release);
        return true;
    }

    bool pop(T & value)
    {
        size_t tail = tail_.load(boost::memory_order_relaxed);
        if (tail == head_.load(boost::memory_order_acquire))
            return false;
        value = ring_[tail];
        tail_.store(next(tail), boost::memory_order_release);
        return true;
    }

private:
    size_t next(size_t current)
    {
        return (current + 1) % Size;
    }

    T ring_[Size];
    boost::atomic<size_t> head_, tail_;
};

// How to use    
int main()
{
    ringbuffer<int, 32> r;

    // try to insert an element
    if (r.push(42)) { /* succeeded */ }
    else { /* buffer full */ }

    // try to retrieve an element
    int value;
    if (r.pop(value)) { /* succeeded */ }
    else { /* buffer empty */ }
}

The code's only limitation is that the buffer length has to be known at compile time (or at construction time, if you replace the array with a std::vector<T>). Allowing the buffer to grow and shrink is not trivial, as far as I understand.
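For construction-time sizing, a minimal sketch of the vector-based variant might look like this (std::atomic is used here so the snippet stands alone; its interface mirrors boost::atomic, so the memory orderings carry over unchanged from the fixed-size version above):

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Runtime-sized single-producer/single-consumer ring buffer: the capacity
// is fixed at construction rather than at compile time. One slot is kept
// empty to distinguish "full" from "empty".
template <typename T>
class ringbuffer_vec
{
public:
    explicit ringbuffer_vec(size_t capacity)
        : ring_(capacity + 1), head_(0), tail_(0) {}

    bool push(const T& value)
    {
        size_t head = head_.load(std::memory_order_relaxed);
        size_t next_head = next(head);
        if (next_head == tail_.load(std::memory_order_acquire))
            return false; // buffer full
        ring_[head] = value;
        head_.store(next_head, std::memory_order_release);
        return true;
    }

    bool pop(T& value)
    {
        size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return false; // buffer empty
        value = ring_[tail];
        tail_.store(next(tail), std::memory_order_release);
        return true;
    }

private:
    size_t next(size_t current) const
    {
        return (current + 1) % ring_.size();
    }

    std::vector<T> ring_;
    std::atomic<size_t> head_, tail_;
};
```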
