Cheaper alternative to std::atomic<bool>?

I have a class of objects in a multithreaded application where any thread can mark an object for deletion, and a central garbage collector thread then actually deletes the object. The threads communicate via member methods that access an internal bool:

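class MyObjects {
...
    // Read by the garbage collector thread.
    bool shouldBeDeleted() const
    {
        return m_Delete;
    }

    // Called by any thread; the flag only ever goes from false to true.
    void markForDelete()
    {
        m_Delete = true;
    }
...
    std::atomic<bool> m_Delete;
};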

The bool was made atomic by someone else in the past because Thread Sanitizer kept complaining. However, perf now suggests that there is a processing overhead during the internal atomic load:

   │     ↓ cbz    x0, 3f4
   │     _ZNKSt13__atomic_baseIbE4loadESt12memory_order():
   │           {
   │             memory_order __b = __m & __memory_order_mask;
   │             __glibcxx_assert(__b != memory_order_release);
   │             __glibcxx_assert(__b != memory_order_acq_rel);
   │
   │             return __atomic_load_n(&_M_i, __m);
   │       add    x0, x0, #0x40
 86,96 │       ldarb  w0, [x0]

Target platform is GCC, AArch64 and Yocto Linux.

Now my questions are as follows:

  • Is atomic really needed in this case? The transition of the bool is one-way (from false to true) with no way back while the object lives, so an inconsistency would merely mean that the object is deleted a little later, right?

  • Is there an alternative to std::atomic<bool> that will silence Thread Sanitizer but is computationally cheaper than std::atomic<bool>?

An obvious modification would be to specify memory_order_relaxed to minimise memory barriers.

See https://en.cppreference.com/w/cpp/atomic/memory_order

and https://bartoszmilewski.com/2008/12/01/c-atomics-and-memory-ordering/

Also see Herb Sutter's classic "Atomic Weapons": https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2

m_Delete.store (true, std::memory_order_relaxed);
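Applied to the class from the question, this might look like the minimal sketch below. It assumes the flag is the only thing the threads exchange, uses the member name m_Delete from the accessors, and omits the rest of the class body.

```cpp
#include <atomic>

class MyObjects {
public:
    // Hot path for the collector: a relaxed load is still atomic (no data
    // race), but it imposes no ordering on any other memory access.
    bool shouldBeDeleted() const
    {
        return m_Delete.load(std::memory_order_relaxed);
    }

    // One-way transition from false to true.
    void markForDelete()
    {
        m_Delete.store(true, std::memory_order_relaxed);
    }

private:
    std::atomic<bool> m_Delete{false};
};
```

With GCC on AArch64 a relaxed byte load is normally emitted as a plain ldrb instead of the ldarb shown in the perf output, though it is worth confirming in the disassembly. Thread Sanitizer understands std::atomic accesses regardless of the memory order, so it should stay quiet.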

Caveat (see the articles above): if there are any co-dependencies on the object being flagged for deletion (e.g. another state variable, freeing resources, etc.), then you may need to use memory_order_release to ensure that setting the can-be-deleted flag occurs last, i.e. that neither the compiler's optimiser nor the CPU moves the preceding writes past it.

Assuming the "garbage collector" checks only the can-be-deleted flag, it would not need to use memory_order_acquire in the load; relaxed would be sufficient. Otherwise it would need to use acquire to guarantee that any co-dependent accesses are not reordered to occur before the flag is read.
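For the co-dependent case, a release store paired with an acquire load could be sketched as follows. The m_Reason member and the extra parameters are hypothetical, added only to have some state that depends on the flag, and the sketch assumes a given object is marked by one thread only (otherwise the plain write to m_Reason would itself race).

```cpp
#include <atomic>

// Hypothetical variant of the class with one piece of co-dependent state.
class MyObjects {
public:
    void markForDelete(int reason)
    {
        m_Reason = reason;                                // plain write ...
        m_Delete.store(true, std::memory_order_release);  // ... published by the release store
    }

    // Returns true and fills `reason` once the flag has been observed.
    bool shouldBeDeleted(int& reason) const
    {
        if (!m_Delete.load(std::memory_order_acquire))
            return false;
        reason = m_Reason;  // ordered after the acquire load, so this sees the write above
        return true;
    }

private:
    int m_Reason = 0;  // hypothetical co-dependent state
    std::atomic<bool> m_Delete{false};
};
```

One thing to weigh when choosing: destroying the object arguably counts as a co-dependent access too, since the destructor touches memory that the marking thread may have written just before setting the flag. If that applies, the acquire/release pair is the safer choice.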

The problem (as clarified in a comment by the OP) is not a true GC but rather delayed deletion of objects on a separate thread, so as to relieve the main processing threads of the time it takes to do the delete. All objects to be deleted are marked as such at some point; at a later time the deletion thread comes along and deletes them.

Consider first: is delayed deletion really necessary in order to meet the program's performance goals, specifically latency? It might just be extra overhead that actually hurts latency. (Or perhaps there are other performance goals, e.g. throughput, to be considered.) Delayed deletion isn't an obvious performance win in all cases; you need to find out whether it is appropriate in each case. (For example, it might not even be necessary for all deletions: perhaps some deletions can be done immediately in-line without impacting performance while others need to be deferred. This could be because, for example, different processing threads are doing different things with different latency/throughput requirements.)

Now to a solution: since we're talking about deferred deletion, there is no reason the deletion thread needs to scan all objects looking for the ones to delete (doing a full scan each time). Instead, pay a slightly larger cost at the time you mark the object for deletion and pay no cost to scan all objects. Do this by linking deleted objects onto a deletion work list. There is a synchronization cost there (which can be minimized in various ways besides obvious locks), but it is paid once per object, not once per object per scan.
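One lock-free way to build such a work list is an intrusive singly linked stack that producers push onto and the deletion thread detaches wholesale. The sketch below is only an illustration: MyObject, m_NextDeleted and DeletionList are hypothetical names, and it assumes an object is pushed at most once and is no longer used by anyone after being pushed.

```cpp
#include <atomic>

// Hypothetical object type; the intrusive link lives in the object itself.
struct MyObject {
    MyObject* m_NextDeleted = nullptr;
    // ... payload ...
};

class DeletionList {
public:
    // Called by any processing thread, once per object.
    void push(MyObject* obj)
    {
        MyObject* head = m_Head.load(std::memory_order_relaxed);
        do {
            obj->m_NextDeleted = head;  // on CAS failure `head` is refreshed, so relink
        } while (!m_Head.compare_exchange_weak(head, obj,
                                               std::memory_order_release,
                                               std::memory_order_relaxed));
    }

    // Called by the deletion thread: detaches the whole list in one atomic
    // step, then deletes every object on it. Returns the number deleted.
    int drain()
    {
        MyObject* obj = m_Head.exchange(nullptr, std::memory_order_acquire);
        int count = 0;
        while (obj != nullptr) {
            MyObject* next = obj->m_NextDeleted;
            delete obj;
            obj = next;
            ++count;
        }
        return count;
    }

private:
    std::atomic<MyObject*> m_Head{nullptr};
};
```

Because the consumer only ever takes the entire list with a single exchange and never pops individual nodes, the usual ABA problem of lock-free stacks does not arise here. Each object costs one compare-and-swap when it is marked, and the deletion thread touches only objects that actually need deleting.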

(It doesn't have to be a linked list either. If there is an upper bound on how many objects can be deleted in a period of time, you can just use an appropriate array.)
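The array variant could be as simple as the following sketch, which uses a plain mutex rather than anything lock-free. BoundedDeletionQueue, MyObject and the capacity parameter N are hypothetical; N stands in for whatever upper bound holds between two runs of the deletion thread.

```cpp
#include <array>
#include <cstddef>
#include <mutex>

struct MyObject { /* ... payload ... */ };  // hypothetical object type

template <std::size_t N>
class BoundedDeletionQueue {
public:
    // Returns false when the array is full; the caller then needs a
    // fallback, for example deleting the object in-line.
    bool push(MyObject* obj)
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        if (m_Count == N)
            return false;
        m_Slots[m_Count++] = obj;
        return true;
    }

    // Copies the pending pointers out under the lock, then deletes the
    // objects without holding it. Returns the number deleted.
    std::size_t drain()
    {
        std::array<MyObject*, N> pending;
        std::size_t count = 0;
        {
            std::lock_guard<std::mutex> lock(m_Mutex);
            count = m_Count;
            for (std::size_t i = 0; i < count; ++i)
                pending[i] = m_Slots[i];
            m_Count = 0;
        }
        for (std::size_t i = 0; i < count; ++i)
            delete pending[i];
        return count;
    }

private:
    std::mutex m_Mutex;
    std::array<MyObject*, N> m_Slots{};
    std::size_t m_Count = 0;
};
```

The lock is held only while pointers are copied, so processing threads never wait for a destructor to run. Whether an uncontended mutex is cheap enough here is something to measure rather than assume.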

There are other possibilities that are opened up by characterizing this problem more precisely as "deferred deletion" rather than "garbage collection": some constraints are lifted (and perhaps others are added).


 