Confusion about implementation error within shared_ptr destructor

I have just seen Herb Sutter's talk: C++ and Beyond 2012: Herb Sutter - atomic<> Weapons, 2 of 2.

He shows a bug in the implementation of the std::shared_ptr destructor:

if( control_block_ptr->refs.fetch_sub(1, memory_order_relaxed ) == 0 )
    delete control_block_ptr; // B

He says that, due to memory_order_relaxed, the delete can be placed before the fetch_sub.

At 1:25:18: "Release doesn't keep line B below, where it should be."

How is that possible? There is a happens-before / sequenced-before relationship, because they are both in a single thread. I might be wrong, but there is also a carries-a-dependency-to relation between the fetch_sub and the delete.

If he is right, which clauses of the ISO standard support that?

Imagine code that releases a shared pointer:

auto tmp = &(the_ptr->a);
*tmp = 10;
the_ptr.dec_ref();

If dec_ref() doesn't have "release" semantics, it's perfectly fine for the compiler (or CPU) to move things from before dec_ref() to after it, for example:

auto tmp = &(the_ptr->a);
the_ptr.dec_ref();
*tmp = 10;

And this is not safe, since dec_ref() can also be called from another thread at the same time and delete the object. So it must have "release" semantics for things before dec_ref() to stay there.
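To make that concrete, here is a minimal sketch of a dec_ref() with only release semantics on the decrement (the ref_counted type and member names are illustrative, not actual shared_ptr internals); the rest of this answer shows why release alone is still not enough:

#include <atomic>

struct ref_counted
{
    std::atomic<int> refs{1};
    int a = 0;

    void dec_ref()
    {
        // fetch_sub returns the previous value, so the last owner sees 1.
        // Release ordering forbids the compiler/CPU from sinking stores made
        // before dec_ref() (such as `*tmp = 10;`) past the decrement, where
        // another thread may already have deleted the object.
        if (refs.fetch_sub(1, std::memory_order_release) == 1)
            delete this;
    }
};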

Let's now imagine that the object's destructor looks like this:

~object() {
    auto xxx = a;
    printf("%i\n", xxx);
}

Also, we will modify the example a bit and have 2 threads:

// thread 1
auto tmp = &(the_ptr->a);
*tmp = 10;
the_ptr.dec_ref();

// thread 2
the_ptr.dec_ref();

Then the "aggregated" code will look like this:

// thread 1
auto tmp = &(the_ptr->a);
*tmp = 10;
{ // the_ptr.dec_ref();
    if (0 == atomic_sub(...)) {
        { //~object()
            auto xxx = a;
            printf("%i\n", xxx);
        }
    }
}

// thread 2
{ // the_ptr.dec_ref();
    if (0 == atomic_sub(...)) {
        { //~object()
            auto xxx = a;
            printf("%i\n", xxx);
        }
    }
}

However, if we only have "release" semantics for atomic_sub(), this code can be optimized this way:

// thread 2
auto xxx = the_ptr->a; // "auto xxx = a;" from destructor moved here
{ // the_ptr.dec_ref();
    if (0 == atomic_sub(...)) {
        { //~object()
            printf("%i\n", xxx);
        }
    }
}

But that way, the destructor will not always print the last value of "a" (this code is not race-free anymore). That's why we also need acquire semantics for atomic_sub (or, strictly speaking, we need an acquire barrier when the counter becomes 0 after the decrement).
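A minimal sketch of that fix, using the release-decrement plus acquire-fence idiom (again with illustrative names; a decrement with memory_order_acq_rel would work as well):

#include <atomic>
#include <cstdio>

struct object
{
    std::atomic<int> refs{2};   // two owners, as in the two-thread example
    int a = 0;

    void dec_ref()
    {
        // Release: publishes this thread's writes to `a` to whichever thread
        // ends up performing the final decrement.
        if (refs.fetch_sub(1, std::memory_order_release) == 1)
        {
            // Acquire barrier on the "counter became 0" path: makes the other
            // thread's writes to `a` visible before ~object() runs, so the
            // destructor cannot print a stale value.
            std::atomic_thread_fence(std::memory_order_acquire);
            delete this;
        }
    }

    ~object() { std::printf("%i\n", a); }
};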

This is a late reply.

Let's start out with this simple type:

#include <iostream>

struct foo
{
    ~foo() { std::cout << value; }
    int value;
};

And we'll use this type in a shared_ptr, as follows:

#include <future>
#include <memory>

void runs_in_separate_thread(std::shared_ptr<foo> my_ptr)
{
    my_ptr->value = 5;
    my_ptr.reset();
}

int main()
{
    std::shared_ptr<foo> my_ptr(new foo);
    std::async(std::launch::async, runs_in_separate_thread, my_ptr);
    my_ptr.reset();
}

Two threads will be running in parallel, both sharing ownership of a foo object.

With a correct shared_ptr implementation (that is, one using memory_order_acq_rel), this program has defined behavior. The only value that this program will print is 5.

With an incorrect implementation (using memory_order_relaxed) there are no such guarantees. The behavior is undefined because a data race on foo::value is introduced. The trouble occurs only in cases where the destructor gets called in the main thread. With a relaxed memory order, the write to foo::value in the other thread may not propagate to the destructor in the main thread. A value other than 5 could be printed.

So what's a data race? Well, check out the definition and pay attention to the last bullet point:

When an evaluation of an expression writes to a memory location and another evaluation reads or modifies the same memory location, the expressions are said to conflict. A program that has two conflicting evaluations has a data race unless either

  • both conflicting evaluations are atomic operations (see std::atomic)
  • one of the conflicting evaluations happens-before another (see std::memory_order)

In our program, one thread will write to foo::value and one thread will read from foo::value. These are supposed to be sequential; the write to foo::value should always happen before the read. Intuitively, it makes sense that they would be, as the destructor is supposed to be the last thing that happens to an object.

memory_order_relaxed does not offer such ordering guarantees, though, and so memory_order_acq_rel is required.
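Applied to the control-block snippet from the talk, a corrected decrement could look roughly like this (a sketch only; note that fetch_sub returns the value the counter had before the subtraction, so the last owner sees 1, not 0):

if (control_block_ptr->refs.fetch_sub(1, std::memory_order_acq_rel) == 1)
    delete control_block_ptr;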

In the talk, Herb shows memory_order_release, not memory_order_relaxed, but relaxed would have even more problems.

Unless delete control_block_ptr accesses control_block_ptr->refs (which it probably doesn't), the atomic operation does not carry a dependency to the delete. The delete operation might not touch any memory in the control block; it might just return that pointer to the free-store allocator.

But I'm not sure if Herb is talking about the compiler moving the delete before the atomic operation, or just referring to when the side effects become visible to other threads.

It looks like he is talking about synchronization of actions on the shared object itself, which are not shown in his code blocks (and, as a result, this is confusing).

That's why he used acq_rel: all actions on the object should happen before its destruction, all in order.

But I'm still not sure why he talks about swapping the delete with the fetch_sub.
