
x86 relaxed ordering performance?

Original post: 2015-01-20 16:08:41 | Tags: c++, c++11, x86, memory-model, stdatomic

Since Intel provides a strong hardware memory model, is there any advantage at all to using "memory_order_relaxed" in a C++11 program? Or should I just leave it at the default "sequentially consistent" ordering, since it makes no difference?

2 Answers

Like most answers in computer science, the answer to this is "that depends."

First of all, the idea that sequentially consistent ordering never carries any penalty is incorrect. Depending on your code (and possibly your compiler), it can and will carry a penalty.
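To make the penalty concrete, here is a minimal sketch of the same store written with three orderings. The names (`flag`, `store_seq_cst`, `sanity_check`, and so on) are made up for illustration, and the comments about generated instructions describe what mainstream x86 compilers typically emit, not something the standard guarantees; inspect your own compiler's output to confirm.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> flag{0};  // hypothetical shared variable

// Default ordering is memory_order_seq_cst. On x86 this store typically
// needs a full barrier: an xchg, or a mov followed by mfence.
void store_seq_cst(int v) { flag.store(v); }

// Release and relaxed stores typically compile to a plain mov on x86,
// because the hardware already keeps stores in program order.
void store_release(int v) { flag.store(v, std::memory_order_release); }
void store_relaxed(int v) { flag.store(v, std::memory_order_relaxed); }

// Loads are typically a plain mov on x86 for every ordering.
int load_flag() { return flag.load(); }

// Single-threaded sanity check: every variant stores the same value;
// only the ordering guarantees (and possibly the cost) differ.
void sanity_check() {
    store_seq_cst(1); assert(load_flag() == 1);
    store_release(2); assert(load_flag() == 2);
    store_relaxed(3); assert(load_flag() == 3);
}
```

So on x86 the difference tends to show up on stores rather than loads. Weaker orderings also leave the compiler freer to reorder or combine surrounding memory accesses, whatever the hardware does.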

Second, to make intelligent decisions about the memory ordering constraints, you need to think about (and understand) how you're using the data involved.

memory_order_relaxed is useful for something like a standalone counter that needs to be atomic but isn't directly related to anything else, so it doesn't need to be consistent with any "something else". The typical example is a reference count, such as in shared_ptr or some older implementations of std::string. In this case, we just need to ensure that the counter is incremented and decremented atomically, and that modifications to it are visible to all threads. In particular, there is no related data with which it needs to remain consistent, so we don't care much about its ordering with respect to anything else. (One caveat: the decrement that can drop the count to zero usually does need release/acquire ordering, so that the thread destroying the object sees every other thread's earlier writes to it; the increments can be relaxed.)
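A standalone counter of this kind might look like the sketch below. The function name `count_events` and its parameters are hypothetical; the point is only that `fetch_add` with `memory_order_relaxed` still never loses an increment.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: several threads bump one shared counter.
long count_events(int num_threads, int increments_per_thread) {
    assert(num_threads >= 0 && increments_per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic read-modify-write; no ordering relative to
                // other memory locations is requested.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with each worker, so the read below
    // observes every increment.
    for (auto& w : workers) w.join();
    return counter.load(std::memory_order_relaxed);
}
```

For example, `count_events(4, 10000)` returns 40000. Note that on x86 an atomic read-modify-write is a lock-prefixed instruction, which acts as a full barrier whatever ordering you ask for, so here the relaxed ordering mainly benefits the compiler and other architectures.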

Sequentially consistent ordering is pretty much at the opposite extreme. It's probably the easiest to apply: you write the code almost as if it were single threaded, and the implementation ensures that it works correctly. That's not to say you don't have to take threading into account at all, but sequentially consistent ordering generally requires the least thought. It is also generally the slowest model.
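One way to see what sequential consistency buys you is the classic "store buffering" test, sketched below. The function name `both_saw_zero` is made up for this example. With the default ordering, the two threads can never both read 0; with relaxed or acquire/release ordering that outcome is allowed, and x86 hardware can actually produce it, because a store may wait in the store buffer while a later load completes.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Returns true if both threads read 0. Sequential consistency forbids
// that outcome, so with the default ordering this always returns false.
bool both_saw_zero() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread a([&] { x.store(1); r1 = y.load(); });  // seq_cst by default
    std::thread b([&] { y.store(1); r2 = x.load(); });  // seq_cst by default
    a.join();
    b.join();
    assert(r1 != -1 && r2 != -1);  // both loads have run
    return r1 == 0 && r2 == 0;
}
```

Preventing that reordering is what the extra barrier on a sequentially consistent store pays for.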

Acquire/release ordering is normally used when you have two or more related pieces of information and need to ensure that one becomes visible only before/after the other. For one example that I dealt with recently, let's assume you're building something roughly like an in-memory database. You have some data, and you have some metadata (and you're storing each more or less separately).

The metadata is used (among other things) for searching the database. We want to ensure that if somebody finds some particular data through the metadata, that data will actually be present in the database.

To guarantee this, the data must always be in place before the metadata that refers to it, and must continue to exist at least as long as that metadata. The database would be inconsistent if somebody could search it using the metadata and find some data they want to use when that data isn't actually present.

So in this case, when we're adding a record, we need to ensure that we add the data first, then add the metadata, and neither the compiler nor the hardware may reorder the two. Likewise, when we're deleting a record, we need to delete the metadata first (so nobody will find the data), then delete the data itself. For the data itself, chances are we have a reference count to keep track of how many clients are currently accessing it, so that we don't delete it while somebody is trying to use it.

So in this case, we can use acquire/release semantics for the metadata and data, and relaxed ordering for the reference count. Or, if we want to keep our code as simple as possible, we could use sequential consistency throughout, even though it might (and probably will) carry at least some penalty.
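A minimal sketch of the "data first, then metadata" rule follows. Everything in it is an assumption made for illustration: the `Record` type, the single `index_slot` standing in for the metadata, and the function names. Deletion and memory reclamation are deliberately left out, since doing them safely takes more machinery than fits here.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <thread>
#include <utility>

struct Record {
    std::string payload;        // the "data"
    std::atomic<int> refs{0};   // clients currently using the record
};

// The "metadata": a single index entry pointing at a record (hypothetical).
std::atomic<Record*> index_slot{nullptr};

void add_record(Record* r, std::string payload) {
    assert(r != nullptr);
    r->payload = std::move(payload);                 // 1. write the data
    index_slot.store(r, std::memory_order_release);  // 2. publish the metadata
}

// Copies the payload into `out` if the record has been published.
bool lookup(std::string& out) {
    // Acquire pairs with the release store above: if we see the pointer,
    // we also see the payload written before it.
    Record* r = index_slot.load(std::memory_order_acquire);
    if (r == nullptr) return false;
    r->refs.fetch_add(1, std::memory_order_relaxed);  // pin: relaxed suffices
    out = r->payload;
    // Unpin with release so a future deleter that observes the count
    // knows this reader's accesses have finished.
    r->refs.fetch_sub(1, std::memory_order_release);
    return true;
}
```

A reader that spins on `lookup` until it succeeds will always see the complete payload, never a half-written one. Replacing every ordering argument with the default would also be correct, just potentially slower.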

Always use the minimum guarantees you need to make your code correct.

No more, and no less.

That way, you avoid any unnecessary dependencies on the implementation, which reduces porting costs, and you still get the fastest program possible.

Of course, if you are sure you will never care about porting any of your code, taking stronger guarantees where you know they won't matter on your platforms may make proving it correct easier.
Being harder to misuse, easier to reason about, or shorter are also perfectly acceptable reasons for using less performant constructs.