Could there be a race condition in the following code?

We know the destructor code below is supposed to release the control block if this is the last smart_ptr pointing to the resource being managed. Is it possible that we have a race between the "if" and the "delete" below? What if we try to create a brand new smart_ptr object in a different thread right AFTER the "if" and BEFORE the "delete"?

// Thread D:
// smart_ptr destructor
~smart_ptr() {
  if (control_block_ptr->refs.fetch_sub(1, memory_order_acq_rel) == 0) {
    delete control_block_ptr;
  }
}

First, a correction:

 control_block_ptr->refs.fetch_sub(1, memory_order_acq_rel) == 0 

You mean == 1. fetch_sub returns the value before the subtraction, not afterwards.

That having been cleared up:

Is it possible that we have a race between the "if" and the "delete" below? What if we try to create a brand new smart_ptr object in a different thread right AFTER the "if" and BEFORE the "delete"?

So, you're destroying a smart_ptr object. If control_block_ptr->refs is 1, then the current smart_ptr is the only object that owns the control block, right? So what are you concerned about?

After all, some "brand new smart_ptr" will have its own "brand new" control_block_ptr and its own reference count. That will in no way interfere with the one being destroyed.

The only way there could be a problem is if there is a race between the destruction of a smart_ptr and the act of copying it. And by "it", I mean literally the same object; not just any smart_ptr, but the same unique owner of this control block that is being destroyed.

See, if you had two smart_ptr objects sharing the same state, copying one while destroying the other is fine, because regardless of the order of those operations, everything works out. One of them will decrement the reference count; the other will increment it. But because the reference count started at 2, it never hits 0.
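To make that scenario concrete, here is a minimal sketch. The control_block layout, the int resource, and the helper copy_while_destroying are assumptions made for illustration; they are not the asker's actual class. One thread copies owner a while another thread destroys a different owner b of the same control block.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Assumed layout: the control block owns the resource and the counter.
struct control_block {
    std::atomic<int> refs{1};
    int* resource;
    explicit control_block(int* r) : resource(r) {}
    ~control_block() { delete resource; }
};

class smart_ptr {
    control_block* cb_;
public:
    explicit smart_ptr(int* r) : cb_(new control_block(r)) {}
    smart_ptr(const smart_ptr& other) : cb_(other.cb_) {
        cb_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    smart_ptr& operator=(const smart_ptr&) = delete;
    ~smart_ptr() {
        // fetch_sub returns the previous value, so 1 means "I was the last owner".
        if (cb_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) delete cb_;
    }
    int use_count() const { return cb_->refs.load(std::memory_order_relaxed); }
    int value() const { return *cb_->resource; }
};

// Copies owner `a` in one thread while destroying a *different* owner `b`
// in another. `a` stays alive throughout, so the count never reaches zero.
inline bool copy_while_destroying() {
    smart_ptr a(new int(42));
    smart_ptr* b = new smart_ptr(a);  // second owner, count is now 2
    int seen = 0;
    std::thread copier([&] { smart_ptr c(a); seen = c.value(); });
    std::thread destroyer([&] { delete b; });
    copier.join();
    destroyer.join();
    return seen == 42 && a.use_count() == 1;
}
```

Whichever order the increment and the decrement land in, a keeps the count at 1 or more until the function returns, so the delete in the destructor runs only once, when a itself goes away.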

But if you're copying the same object you're destroying... well, that's just broken code. Note that while std::shared_ptr specifically states that updates to the reference count in the shared state are atomic and do not cause data races, multiple accesses to the same shared_ptr object from different threads are a data race (and thus undefined behavior) so long as at least one of the accesses is not a const operation. The copy is a const operation, but the destruction is not, so it applies.

The same will have to be true of your smart_ptr: if you attempt to copy the same object you're destroying, then bad things happen.

Generally (at least for Boost and standard library smart pointers), smart pointer objects themselves are not designed to be thread-safe. Only the management/lifetime of the object that they point to is thread-safe. The smart_ptr object itself is not safe to use in multiple threads simultaneously, but it is safe to have multiple different smart_ptr objects, all referring to the same underlying data, in use simultaneously.
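As a rough illustration of the safe pattern with std::shared_ptr, the sketch below gives every worker thread its own shared_ptr object. The function name sum_from_workers and the value 7 are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical demo: every worker owns its *own* shared_ptr object,
// and all of them share one control block.
inline int sum_from_workers(int n_threads) {
    auto data = std::make_shared<const int>(7);
    std::atomic<int> total{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i) {
        // The by-value capture copies `data` in the launching thread, so
        // each worker receives a distinct shared_ptr object.
        workers.emplace_back([local = data, &total] {
            std::shared_ptr<const int> another = local;  // copies its own object
            total.fetch_add(*another, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) w.join();
    return total.load();
}
```

What would not be safe is capturing data by reference in the workers and then resetting or destroying it in the launching thread while they are still copying from it: that is the "same object" case.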

In your example, the "brand new smart_ptr" that is being created in a different thread must be copied from some existing smart_ptr.

If that existing smart_ptr is not the one being destructed, it will make sure that the if branch is never taken in the destructor, since it keeps the object alive.

If that existing smart_ptr is the one that is being destructed in your example, then you have a problem, but this is because you are attempting to use a smart_ptr object that is in the middle of being destroyed. Even if the destructor were race-free in this scenario, there would still be the possibility that the other thread continues using the smart_ptr after it has been destroyed, which is always illegal in C++.

Simply put, no.

A race condition, that is, an end result that depends on a particular ordering of operations, is only possible on a reference count when an operation can make an object that contributes to the count from one that doesn't, that is, when you have weak references.

Here what counts as a racing endpoint is whether or not the resource is released because the refcount (RC) reaches zero; the question "which exact operation, performed by which thread, makes the RC zero" is an interesting endpoint: the implicit assumption in the use of an RC for managing resources in a multithreading context is that any thread (which has the last owner) can release the resource.

By definition an RC is the sum of the individual, strictly positive contributions of each owner (each contribution happens to be 1, since the RC is the number of owners, but that isn't very important). In an abstract setting, the RC could also be formalized as the set of owners, and an integer is an efficient representation of the information needed, because of the specifics of an RC:

  • each owner knows that it's in the set
  • owners don't know each other
  • each owner needs to increase the number by some amount (which is 1 by definition, but could be any strictly positive number) on being added to the set, and decrease it by the same amount when it removes itself from the set

So you can imagine the number as essentially a list of owners, each represented by a vertical bar, as when children learn numbers (3 = |||), and each individual owner knows only its own bar (you can say either that all bars are the same or that they have distinct colors). (The integers are obviously physically represented in binary.)

In setups where only owners exist (no operation can refer to an RC through a "weak reference") there are only two basic operations on the set of owners:

  • duplicate an owner: make a new owner from an owner
  • remove an owner

The duplication just adds a vertical bar. In a graphical display, you could even split a bar by erasing its middle, ending with two half bars. This is ownership being dissolved (as if you sold part of your shares of a traded corporation).

The removal operation erases the vertical bar that belongs to the owner (of course in practice there are no identified bars, no bars at all, and it's a decrement operation on an integer represented in binary); if that operation removed the last bar, then the removing thread is responsible for releasing the resource.

You can easily see that a zero RC corresponds to an empty set of bars, which happens after all owners have given up on owning. There is a race condition that determines the exact number of bars at any given time, but that's a detail: what matters is that each owner in each thread knows that it is an owner. It's essentially indifferent to the number of other owners. (If you were not indifferent, you probably would have wanted unique ownership in the first place, to be able to alter the resource without impact on other users.)

That there is a race condition on the set implies that internal atomics (or a similar alternative) must be used, but owners shouldn't care in general. It would be catastrophic if a particular "bar" were accidentally erased in the representation: it would mean that an owner is not accounted for, and it would be a zombie owner that believes it owns the resource but doesn't actually own anything. Atomic RMW (read-modify-write) operations guarantee that this cannot happen: an atomic object modified exclusively by RMW operations cannot lose modifications.
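The "no lost modifications" property can be observed with a small experiment. This is only a sketch; the function name count_with_rmw and the thread counts are arbitrary choices for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each thread adds `per_thread` "bars" using an atomic read-modify-write.
// Because every update is an RMW, none of them can be lost, whatever the
// interleaving of the threads.
inline int count_with_rmw(int n_threads, int per_thread) {
    std::atomic<int> bars{0};
    std::vector<std::thread> owners;
    for (int t = 0; t < n_threads; ++t) {
        owners.emplace_back([&bars, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                bars.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& o : owners) o.join();
    return bars.load();
}
```

With a plain int and ++ instead of fetch_add, the same program would contain a data race and could drop increments; relaxed ordering is enough here because only the final total matters and join() provides the synchronization for the last load.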

The very concept of ownership implies that it can't be created from nothing; you can only become the owner of something:

  • by creating that thing
  • or by acquiring (some) ownership from someone who already has some

That's just common sense.

Because destruction of a share of ownership is irreversible, reaching zero owners is a terminating event; at that point, the RC is guaranteed never to be changed again, or even to be measured again. (So ownership of the RC representation is superposable with ownership of the resource itself.)

These properties make an RC implementation of true ownership very simple. This is different when weak references, that is, RC-only watchers, enter the picture: an RC measurement tool provides the guarantee that it can make a measurement on the RC at any point in the future, whether or not the managed user resource still has owner(s), and whether or not the user resource still exists. With weak references an RC can be read as zero by an atomic operation, that is, no vertical bar in the graphical representation of the set of owners. It means that the lifetime of the RC becomes distinct from the lifetime of the user resource: the RC itself becomes another managed resource (managed by an internal RC).

Weak references allow the creation of owners without sharing existing ownership: a weak reference is an (unreliable) "option" on future ownership. Although weak references can only be created from real (i.e. strong) references, that conflicts with the general principles of ownership.

So only with weak references can the RC be durably zero, that is, zero outside the small interval between the last decrease operation and the freeing of the RC itself. Using weak references means that user code must be designed to deal with the possibility of a zero RC, and that in MT code there can be a race between a thread giving up ownership as the last owner and another thread trying to recreate ownership from a weak reference.

In real life, traditional cultural knowledge is free and can be replicated as much as you want. But for historical traditional knowledge known by extremely few people (like a family cooking recipe), you'd better make copies before the last person having that knowledge dies: there is a race between people dying with traditional knowledge and people acquiring and transmitting that knowledge. This is essentially the same as the weak reference/strong reference race issue.

So weak references can be useful to model real-world issues where an observer cannot forcibly keep a resource alive if external factors tear it down, but can observe the evolution of the resource while it exists. Promoting a weak reference to a strong one takes a snapshot of the liveness of the resource at the instant the conversion is done, and is inherently racy.
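With the standard library, that promotion is std::weak_ptr::lock(), which either yields a new owner or an empty shared_ptr. A minimal sketch, where can_promote is a hypothetical helper name used only for this example:

```cpp
#include <cassert>
#include <memory>

// Tries to turn a non-owning watcher into an owner. The result is a
// snapshot: it says whether the resource was still alive at the instant
// lock() ran, nothing more.
inline bool can_promote(const std::weak_ptr<int>& watcher) {
    std::shared_ptr<int> owner = watcher.lock();  // empty if no owner is left
    return owner != nullptr;
}
```

In multithreaded code the caller has to handle both outcomes, because another thread may drop the last strong reference just before lock() runs. A true result only guarantees the resource stays alive while the returned shared_ptr (local to the helper here) exists.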

Note that any meaningful use of many primitives is racy at some level: you use a mutex because you don't know which thread will need exclusive access to a resource first; if you knew the exact order of execution, you would serialize the threads and avoid the complexity of threading altogether. A race to get access to a resource is not a bug. There is a bug only when the correctness of a program execution depends on a particular order of events.

We know the destructor code below is supposed to release the control block if this is the last smart_ptr pointing to the resource being managed.

Yes, and this is the correct, wanted behavior when the useful lifetime of the RC ends when it reaches zero, that is, when true ownership is implemented and weak references are not supported. That would not be correct for a Boost or standard shared_ptr, which does support "weak" pointers, aka non-owning observers.

Although the race condition you mentioned is semantically impossible, there are issues here:

~smart_ptr() {
  if (control_block_ptr->refs.fetch_sub(1, memory_order_acq_rel) == 0) {
    delete control_block_ptr;
  }
}

As explained, the lifetime of the RC, when no pure observers (which are non-owning and can see a zero RC) exist, is the same as that of the managed user resource. I don't see the release of the user resource here (it might be elsewhere).

Where is the user resource itself managed? Is it in the destructor of the *control_block_ptr object? Can you post a bit more code to give a complete picture?

Also, you used the post-decrement operation fetch_sub instead of a pre-decrement operation: "post" operations return the previous value, then perform the operation. With a post operation, the special RC value you want to act upon, the value before the last owner stops being an owner, is 1, not 0.
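The two styles side by side, as a sketch; the helper names last_owner_post and last_owner_pre are invented for the comparison. For std::atomic<int>, fetch_sub returns the old value, while the prefix operator-- returns the new one (and always uses sequentially consistent ordering).

```cpp
#include <atomic>
#include <cassert>

// "Post" style: fetch_sub hands back the value *before* the decrement,
// so the last owner observes 1.
inline bool last_owner_post(std::atomic<int>& rc) {
    return rc.fetch_sub(1, std::memory_order_acq_rel) == 1;
}

// "Pre" style: prefix operator-- hands back the value *after* the
// decrement, so the last owner observes 0.
inline bool last_owner_pre(std::atomic<int>& rc) {
    return --rc == 0;
}
```

Either form works in a destructor as long as the comparison matches the style; mixing fetch_sub with == 0 means the block is never freed by the last owner.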

If the constructor is non-buggy, then it will see that the refcount is already zero, and thus that the destruction process has started and is irreversible. So from the POV of other threads, the object is already destroyed when the refcount hits zero, even though delete hasn't actually deallocated the memory yet.

Plus, if a destructor brings the refcount to zero, it means there aren't any more shared_ptr objects to copy-construct from. (Unless you have a use-after-free bug for the shared_ptr object itself, not the control block. In which case you're using it wrong: normally you don't make references to shared_ptr objects.)

So if I'm remembering correctly how shared_ptr works, this problem doesn't exist for it. (That is also why the refcount can be kept in the object that will be deallocated. In a non-garbage-collected environment, the reclaim problem is hard because you can't free memory that some other thread might still have a pointer to. See user-space vs. kernel RCU implementations for examples of the challenges. A lockless linked-list queue might always return nodes to a dedicated free-list for objects of that type, not to a general pool that could let them be reused as something else or unmapped with a system call.)


But in the general case of refcounted objects, seeing refcount=0 means destruction has passed the point of no return, and your attempt to acquire a new reference to it has failed (i.e. it happened after the destruction, in the global order established by the atomic counter).

This "seeing" would be in the return value of a fetch_add(+1), of course.


Anyway, this design for attempts to acquire a new reference is what makes it safe to proceed to deallocation after your decrement made the refcount zero.
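One way to sketch such an acquisition attempt is shown below. Note the swap: instead of the fetch_add mentioned above, this version uses a compare-exchange loop, so that a failed attempt leaves the counter at zero rather than bumping it. The name try_acquire is hypothetical, and the sketch assumes the counter's memory is still valid when it is called (for example because a separate weak count keeps the control block allocated).

```cpp
#include <atomic>
#include <cassert>

// Increments the refcount only if it is still nonzero. Returns false when
// zero is observed: destruction has passed the point of no return.
inline bool try_acquire(std::atomic<int>& refs) {
    int seen = refs.load(std::memory_order_relaxed);
    while (seen != 0) {
        // On failure, compare_exchange_weak reloads `seen` with the
        // current value and the loop re-checks it against zero.
        if (refs.compare_exchange_weak(seen, seen + 1,
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;
}
```

Since no successful acquisition can follow a decrement to zero, the thread whose decrement produced zero can go on to free the resource without a later owner appearing.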
