简体   繁体   中英

How to mix atomic and non-atomic operations in C++?

The std::atomic types allow atomic access to variables, but I would sometimes like non-atomic access, for example when the access is protected by a mutex. Consider a bitfield class that allows both multi-threaded access (via insert) and single-threaded vectorized access (via operator|=):

class Bitfield
{
    const size_t size_, word_count_;
    std::atomic<size_t> * words_;
    std::mutex mutex_;

public:

    Bitfield (size_t size) :
        size_(size),
        word_count_((size + 8 * sizeof(size_t) - 1) / (8 * sizeof(size_t)))
    {
        // make sure words are 32-byte aligned
        posix_memalign(&words_, 32, word_count_ * sizeof(size_t));
        for (int i = 0; i < word_count_; ++i) {
            new(words_ + i) std::atomic<size_t>(0);
        }
    }
    ~Bitfield () { free(words_); }

private:
    void insert_one (size_t pos)
    {
        size_t mask = size_t(1) << (pos % (8 * sizeof(size_t)));
        std::atomic<size_t> * word = words_ + pos / (8 * sizeof(size_t));
        word->fetch_or(mask, std::memory_order_relaxed);
    }
public:
    void insert (const std::set<size_t> & items)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        // do some sort of muti-threaded insert, with TBB or #pragma omp
        parallel_foreach(items.begin(), items.end(), insert_one);
    }

    void operator |= (const Bitfield & other)
    {
        assert(other.size_ == size_);
        std::unique_lock<std::mutex> lock1(mutex_, defer_lock);
        std::unique_lock<std::mutex> lock2(other.mutex_, defer_lock);
        std::lock(lock1, lock2); // edited to lock other_.mutex_ as well
        // allow gcc to autovectorize (256 bits at once with AVX)
        static_assert(sizeof(size_t) == sizeof(std::atomic<size_t>), "fail");
        size_t * __restrict__ words = reinterpret_cast<size_t *>(words_);
        const size_t * __restrict__ other_words
            = reinterpret_cast<const size_t *>(other.words_);
        for (size_t i = 0, end = word_count_; i < end; ++i) {
            words[i] |= other_words[i];
        }
    }
};

Note operator|= is very close to what's in my real code, but insert(std::set) is just attempting to capture the idea that one can

acquire lock;
make many atomic accesses in parallel;
release lock;

My question is this: what is the best way to mix such atomic and non-atomic access? Answers to [1,2] below suggest that casting is wrong (and I agree). But surely the standard allows such apparently safe access?

More generally, can one use a reader-writer-lock and allow "readers" to read and write atomically, and the unique "writer" to read and write non-atomically?

References

  1. How to use std::atomic efficiently
  2. Accessing atomic<int> of C++0x as non-atomic

Standard C++ prior to C++11 had no multithreaded memory model. I see no changes in the standard that would define the memory model for non-atomic accesses, so those get similar guarantees as in a pre-C++11 environment.

It is actually theoretically even worse than using memory_order_relaxed , because the cross thread behavior of non-atomic accesses is simply completely undefined as opposed to multiple possible orders of execution one of which must eventually happen.

So, to implement such patterns while mixing atomic and non-atomic accesses, you will still have to rely on platform specific non-standard constructs (for example, _ReadBarrier ) and/or intimate knowledge of particular hardware.

A better alternative is to get familiar with the memory_order enum and hope to achieve optimum assembly output with a given piece of code and compiler. The end result may be correct, portable, and contain no unwanted memory fences, but you should expect to disassemble and analyze several buggy versions first, if you are like me; and there will still be no guarantee that the use of atomic accesses on all code paths will not result in some superfluous fences on a different architecture or a different compiler.

So the best practical answer is simplicity first. Design your cross-thread interactions as simple as you can make it without completely killing scalability, responsiveness or any other holy cow; have nearly no shared mutable data structures; and access them as rarely as you can, always atomically.

If you could do this, you'd have (potentially) one thread reading/writing a data object using atomic accesses and another thread reading/writing the same data object without using atomic accesses. That's a data race, and the behavior would be undefined.

In C++20 there is std::atomic_ref , which allows atomic operations on non-atomic data.

So you should be able to declare words_ as non-atomic size_t* and use std::atomic_ref<size_t> to do atomic operations when needed. But be aware of the requirements:

While any atomic_ref instances referencing an object exists, the object must be exclusively accessed through these atomic_ref instances. No subobject of an object referenced by an atomic_ref object may be concurrently referenced by any other atomic_ref object.

upd: In this particular case you probably also need std::shared_mutex to separate atomic "reader's" modifications from non-atomic "writer's" modifications.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM