Boost vs std atomic sequential consistency semantics

Original post: 2015-04-10 04:05:39. Tags: c++, c++11, boost, atomic, memory-model

I'd like to write a C++ lock-free object in which many logger threads log to a large global (non-atomic) ring buffer, with an occasional reader thread that wants to read as much of the data in the buffer as possible. I ended up with a global atomic counter from which loggers obtain locations to write to; each logger increments the counter atomically before writing. The reader reads the buffer together with a per-logger local (atomic) variable to learn whether particular buffer entries are currently being written by some logger, so that it can avoid using them.

So I have to synchronize a pure reader thread with many writer threads. I sense that the problem can be solved without locks, and that I can rely on the "happens after" relation to determine whether my program is correct.

I tried relaxed atomic operations, but that won't work: stores to atomic variables are releases and loads are acquires, and the only guarantee is that some acquire (and its subsequent work) always "happens after" some release (and its preceding work). That means there is no way for the reader thread (which does no store at all) to guarantee that something "happens after" the time it reads the buffer, so I cannot know whether some logger has overwritten part of the buffer while the reader is reading it.
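The one-directional nature of that guarantee can be seen in a minimal release/acquire sketch. The names `message_passing_once`, `payload`, and `ready` are hypothetical and not part of the original program; the point is only that the acquiring side can order itself after the releasing side, never the other way round.

```cpp
#include <atomic>
#include <thread>

// Hypothetical illustration: the writer publishes `payload` with a release
// store, and the reader picks it up after an acquire load observes the flag.
int message_passing_once() {
    int payload = 0;                   // plain, non-atomic data
    std::atomic<bool> ready{false};    // the synchronizing flag
    int seen = -1;

    std::thread writer([&] {
        payload = 42;                                   // work preceding the release
        ready.store(true, std::memory_order_release);   // release
    });
    std::thread reader([&] {
        // Spin until the acquire load reads the value written by the release.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;                // work following the acquire: sees 42
    });

    writer.join();
    reader.join();
    return seen;
}
```

Here the reader's work happens after the writer's. Nothing in this scheme lets a thread that only performs loads force another thread's later writes to happen after its own reads, which is the gap described above.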

So I turned to sequential consistency. For me, "atomic" means Boost.Atomic, whose notion of sequential consistency has a documented "pattern":

The third pattern for coordinating threads via Boost.Atomic uses seq_cst for coordination: If ...

thread1 performs an operation A,

thread1 subsequently performs any operation with seq_cst,

thread1 subsequently performs an operation B,

thread2 performs an operation C,

thread2 subsequently performs any operation with seq_cst,

thread2 subsequently performs an operation D,

then either "A happens-before D" or "C happens-before B" holds.

Note that the second and fifth lines say "any operation", without saying whether it modifies anything or what it operates on. This provides the guarantee that I wanted.
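As a rough sketch of that pattern in standard C++11, one can take the "any operation with seq_cst" to be a `std::atomic_thread_fence(std::memory_order_seq_cst)`. This is an assumption made for illustration: the names `both_missed_once` and `count_violations` are hypothetical, and the standard phrases the resulting guarantee in terms of which writes a load must observe rather than literally as "happens-before".

```cpp
#include <atomic>
#include <thread>

// One trial. Returns true only if both threads missed each other's write,
// which the two seq_cst fences are expected to rule out.
bool both_missed_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);                // operation A
        std::atomic_thread_fence(std::memory_order_seq_cst);  // seq_cst operation
        r1 = y.load(std::memory_order_relaxed);               // operation B
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_relaxed);                // operation C
        std::atomic_thread_fence(std::memory_order_seq_cst);  // seq_cst operation
        r2 = x.load(std::memory_order_relaxed);               // operation D
    });
    t1.join();
    t2.join();

    // The fences are totally ordered. If thread1's fence comes first, D must
    // observe A (r2 == 1); otherwise B must observe C (r1 == 1).
    return r1 == 0 && r2 == 0;
}

// Repeats the trial and counts how often the forbidden outcome appeared.
int count_violations(int trials) {
    int violations = 0;
    for (int i = 0; i < trials; ++i) {
        if (both_missed_once()) ++violations;
    }
    return violations;
}
```

With the fences removed (or weakened to acquire/release), the outcome `r1 == 0 && r2 == 0` becomes permitted, which is the difference between the two orderings that the question is about.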

All was well until I watched Herb Sutter's talk titled "atomic<> Weapons". What he implies is that seq_cst is just acq_rel with the additional guarantee of a consistent ordering of atomic stores. I turned to cppreference.com, which has a similar description.

So my questions:

1. Do C++11 and Boost.Atomic implement the same memory model?
2. If (1) is "yes", does it mean the "pattern" described by Boost is somehow implied by the C++11 memory model? How? Or does it mean that the documentation of either Boost or C++11 (on cppreference) is wrong?
3. If (1) is "no", or (2) is "yes, but the Boost documentation is incorrect", is there any way to achieve the effect I want in C++11, namely to guarantee that (the work subsequent to) some atomic store happens after (the work preceding) some atomic load?

1 answer

I saw no answer here, so I asked again on the Boost users mailing list. I saw no answer there either (apart from a suggestion to look into Boost.Lockfree), so I planned to ask Herb Sutter (expecting no answer anyway). But before doing that, I Googled "C++ memory model" a little more deeply. After reading a page by Hans Boehm ( http://www.hboehm.info/c++mm/ ), I could answer most of my own question. I Googled a bit more, this time for "C++ Data Race", and landed on a page by Bartosz Milewski ( http://bartoszmilewski.com/2014/10/25/dealing-with-benign-data-races-the-c-way/ ). Then I could answer even more of my own question. Unluckily, even with that knowledge I still don't know how to do what I want to do. Perhaps what I want is actually unachievable in standard C++.

The first part of my question: "Do C++11 and Boost.Atomic implement the same memory model?" The answer is, mostly, "yes". The second part of my question: "If (1) is 'yes', does it mean the 'pattern' described by Boost is somehow implied by the C++11 memory model?" The answer is again "yes". The "How?" is answered by a proof found here ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2392.html ). Essentially, for data-race-free programs, the little bit added to acq_rel is sufficient to guarantee the behavior required by seq_cst. So both sets of documentation, although perhaps confusing, are correct.

Now the real problem: although both (1) and (2) get "yes" answers, my original program is wrong! I neglected (actually, I was unaware of) an important rule of C++: a program with a data race has undefined behavior (rather than "unspecified" or "implementation-defined" behavior). That is, the compiler guarantees the behavior of my program only if it has absolutely no data race. Without a lock, my program contains a data race: the pure reader thread can read at any time, even while a logger thread is busy writing. This is undefined behavior, and the rule says that the computer can do anything (the "catch fire" rule). To fix it, one has to use the ideas in the Bartosz Milewski page I mentioned earlier, i.e., change the ring buffer to contain only atomic content, so that the compiler knows that the ordering of those accesses matters and that they must not be reordered with the operations marked as requiring sequential consistency. If minimal overhead is desired, one can write to the buffer using relaxed atomic operations.
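A minimal sketch of that fix on the writer side might look like the following. Everything here is an assumption for illustration: the names `kSlots`, `g_ring`, `g_next`, `log_word`, and `run_loggers` are hypothetical, each log entry is reduced to a single 64-bit word, and the per-logger "busy" markers from the question are left out.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

constexpr std::size_t kSlots = 1024;  // hypothetical capacity

// The ring buffer holds only atomic words, so concurrent access is not a
// data race. Static storage means every slot starts out as zero.
std::array<std::atomic<std::uint64_t>, kSlots> g_ring;
std::atomic<std::uint64_t> g_next{0};  // global counter handing out slots

// Claims the next slot, then writes one word into it. The counter update
// uses the default seq_cst ordering; the payload write itself is relaxed.
std::uint64_t log_word(std::uint64_t value) {
    std::uint64_t ticket = g_next.fetch_add(1);
    g_ring[ticket % kSlots].store(value, std::memory_order_relaxed);
    return ticket;
}

// Runs several logger threads, each writing a number of nonzero words.
void run_loggers(int threads, int words_per_thread) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([t, words_per_thread] {
            for (int i = 0; i < words_per_thread; ++i) {
                log_word(static_cast<std::uint64_t>(t + 1) * 1000000u + i);
            }
        });
    }
    for (auto& th : pool) th.join();
}
```

The sketch only shows that the buffer accesses become well defined; it does not decide how a reader tells a finished entry from one that is still being written, which the original design handles with per-logger variables.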

Unluckily, this applies to the reader thread too. I can no longer just "memcpy" the whole memory buffer. Instead, I must also use relaxed atomic operations to read the buffer, one word after another. This kills performance, but I actually have no choice. Luckily, the dumper's performance does not matter to me at all: it rarely gets run anyway. But if I did want the performance of "memcpy", the answer would be "no solution": C++ provides no semantics for "I know there is a data race; you can return anything to me here, but don't screw up my program". Either you ensure that there is no data race and pay the cost of getting everything well defined, or you have a data race and the compiler is allowed to put you in jail.
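The word-by-word read could be sketched roughly as below. The function name `snapshot` and its signature are hypothetical; it simply stands in for the `memcpy` that is no longer allowed once the buffer consists of atomics.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies `count` words out of a buffer of atomics, one relaxed load at a
// time. Each individual load is well defined even if a logger is writing
// concurrently; the copy as a whole is not an atomic snapshot.
std::vector<std::uint64_t> snapshot(const std::atomic<std::uint64_t>* ring,
                                    std::size_t count) {
    std::vector<std::uint64_t> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(ring[i].load(std::memory_order_relaxed));
    }
    return out;
}
```

A caller would pass a pointer to the first slot of the atomic buffer and the number of slots, then decide afterwards which of the copied entries are trustworthy.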