How to mix atomic and non-atomic operations in C++?
The std::atomic types allow atomic access to variables, but I would sometimes like non-atomic access, for example when the access is protected by a mutex. Consider a bitfield class that allows both multi-threaded access (via insert) and single-threaded vectorized access (via operator|=):
class Bitfield
{
    const size_t size_, word_count_;
    std::atomic<size_t> * words_;
    mutable std::mutex mutex_;  // mutable so operator|= can lock a const other
public:
    Bitfield (size_t size) :
        size_(size),
        word_count_((size + 8 * sizeof(size_t) - 1) / (8 * sizeof(size_t)))
    {
        // make sure words are 32-byte aligned
        // (posix_memalign takes a void **, so go through a temporary)
        void * raw = nullptr;
        if (posix_memalign(&raw, 32, word_count_ * sizeof(size_t)) != 0) {
            throw std::bad_alloc();
        }
        words_ = static_cast<std::atomic<size_t> *>(raw);
        for (size_t i = 0; i < word_count_; ++i) {
            new(words_ + i) std::atomic<size_t>(0);
        }
    }
    ~Bitfield () { free(words_); }
private:
    void insert_one (size_t pos)
    {
        size_t mask = size_t(1) << (pos % (8 * sizeof(size_t)));
        std::atomic<size_t> * word = words_ + pos / (8 * sizeof(size_t));
        word->fetch_or(mask, std::memory_order_relaxed);
    }
public:
    void insert (const std::set<size_t> & items)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        // do some sort of multi-threaded insert, with TBB or #pragma omp
        // (parallel_foreach is a placeholder, not a real library call)
        parallel_foreach(items.begin(), items.end(),
                         [this](size_t pos) { insert_one(pos); });
    }
    void operator |= (const Bitfield & other)
    {
        assert(other.size_ == size_);
        std::unique_lock<std::mutex> lock1(mutex_, std::defer_lock);
        std::unique_lock<std::mutex> lock2(other.mutex_, std::defer_lock);
        std::lock(lock1, lock2); // edited to lock other.mutex_ as well
        // allow gcc to autovectorize (256 bits at once with AVX)
        static_assert(sizeof(size_t) == sizeof(std::atomic<size_t>), "fail");
        // the casts below are the questionable part this post asks about
        size_t * __restrict__ words = reinterpret_cast<size_t *>(words_);
        const size_t * __restrict__ other_words
            = reinterpret_cast<const size_t *>(other.words_);
        for (size_t i = 0, end = word_count_; i < end; ++i) {
            words[i] |= other_words[i];
        }
    }
};
Note operator|= is very close to what's in my real code, but insert(std::set) is just attempting to capture the idea that one can:
acquire lock;
make many atomic accesses in parallel;
release lock;
My question is this: what is the best way to mix such atomic and non-atomic access? Answers to [1,2] below suggest that casting is wrong (and I agree). But surely the standard allows such apparently safe access?
More generally, can one use a reader-writer lock and allow "readers" to read and write atomically, and the unique "writer" to read and write non-atomically?
Standard C++ prior to C++11 had no multithreaded memory model. I see no changes in the standard that would define the memory model for non-atomic accesses, so those get similar guarantees as in a pre-C++11 environment.
It is actually theoretically even worse than using memory_order_relaxed, because the cross-thread behavior of non-atomic accesses is simply completely undefined, as opposed to multiple possible orders of execution, one of which must eventually happen.
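To make the contrast concrete, here is a minimal sketch (not from the original answer) of the well-defined side: several threads OR bits into one std::atomic word using memory_order_relaxed. The helper name set_bits_concurrently and its parameters are made up for illustration. Whatever the interleaving, every fetch_or takes effect, so the final value is always the same; doing the same with a plain size_t and no synchronization would be a data race.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: sets bits [0, bit_count) of one shared word from
// thread_count threads, thread t handling bits t, t + thread_count, ...
std::size_t set_bits_concurrently(unsigned bit_count, unsigned thread_count)
{
    assert(bit_count <= 8 * sizeof(std::size_t));  // bits must fit in one word
    std::atomic<std::size_t> word{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        workers.emplace_back([&word, t, bit_count, thread_count] {
            for (unsigned bit = t; bit < bit_count; bit += thread_count) {
                // Relaxed gives atomicity only, no ordering with other memory.
                word.fetch_or(std::size_t(1) << bit, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with each thread's completion, so the load below
    // sees every fetch_or.
    for (auto & w : workers) w.join();
    return word.load(std::memory_order_relaxed);
}
```

For example, set_bits_concurrently(16, 4) returns 0xFFFF no matter how the four threads are scheduled.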
So, to implement such patterns while mixing atomic and non-atomic accesses, you will still have to rely on platform-specific non-standard constructs (for example, _ReadBarrier) and/or intimate knowledge of particular hardware.
A better alternative is to get familiar with the memory_order enum and hope to achieve optimum assembly output with a given piece of code and compiler. The end result may be correct, portable, and free of unwanted memory fences, but if you are like me, you should expect to disassemble and analyze several buggy versions first. And there will still be no guarantee that using atomic accesses on all code paths will not produce some superfluous fences on a different architecture or with a different compiler.
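As an illustration of what the memory_order enum buys you, here is a small sketch (my own, not part of the original answer) in which a plain int is handed from one thread to another through an atomic flag. The function name publish_and_read is hypothetical. The release store and the acquire load form a pair, so the plain write to payload happens-before the plain read, and no non-atomic access races.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: publish ordinary data through an atomic flag.
int publish_and_read(int value)
{
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // the only atomic object
    int seen = -1;

    std::thread consumer([&] {
        // Spin until the flag is set; acquire pairs with the release below.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // plain read, ordered after the producer's plain write
    });

    payload = value;                                // plain write
    ready.store(true, std::memory_order_release);   // publish it
    consumer.join();

    assert(seen == value);  // guaranteed by the release/acquire pair
    return seen;
}
```

Whether this costs a fence instruction depends on the target; on x86-64, release stores and acquire loads usually compile to ordinary moves, which is the kind of thing you would confirm by reading the disassembly.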
So the best practical answer is simplicity first. Design your cross-thread interactions to be as simple as you can make them without completely killing scalability, responsiveness, or any other sacred cow; have nearly no shared mutable data structures; and access them as rarely as you can, always atomically.
If you could do this, you'd have (potentially) one thread reading/writing a data object using atomic accesses and another thread reading/writing the same data object without using atomic accesses. That's a data race, and the behavior would be undefined.
In C++20 there is std::atomic_ref, which allows atomic operations on non-atomic data.
So you should be able to declare words_ as a non-atomic size_t* and use std::atomic_ref<size_t> to do atomic operations when needed. But be aware of the requirements:
While any atomic_ref instances referencing an object exists, the object must be exclusively accessed through these atomic_ref instances.
No subobject of an object referenced by an atomic_ref object may be concurrently referenced by any other atomic_ref object.
upd: In this particular case you probably also need std::shared_mutex to separate atomic "reader's" modifications from non-atomic "writer's" modifications.