shared_lock implemented with std::atomic
I can't afford to hold shared_lock due to its weight. Instead I implemented what I think is shared_lock, but with an atomic value. Will this code work, or have I missed something?

Update: I added RAII classes around the atomic mutex.
#include <atomic>
#include <limits>
#include <immintrin.h> // _mm_pause

namespace atm {

static const unsigned int g_unlocked = 0;
static const unsigned int g_exclusive = std::numeric_limits<unsigned int>::max();
using mutex_type = std::atomic<unsigned int>;

class exclusive_lock {
public:
    exclusive_lock(mutex_type& mutex) : _mutex(mutex) {
        unsigned int expected = g_unlocked;
        while (!_mutex.compare_exchange_weak(expected, g_exclusive)) {
            _mm_pause();
            expected = g_unlocked;
        }
    }
    ~exclusive_lock() { _mutex.store(g_unlocked, std::memory_order_release); }

private:
    mutex_type& _mutex;
};

class shared_lock {
public:
    shared_lock(mutex_type& mutex) : _mutex(mutex) {
        unsigned int expected = _mutex;
        while (expected == g_exclusive ||
               !_mutex.compare_exchange_weak(expected, expected + 1)) {
            _mm_pause();
            expected = _mutex;
        }
    }
    ~shared_lock() { _mutex.fetch_sub(1, std::memory_order_release); }

private:
    mutex_type& _mutex;
};

} // namespace atm
For correctness I think this looks reasonable; I don't see a problem. I might have missed something, but a seq_cst CAS is more than sufficient for acquiring a lock. It looks like you're avoiding integer wrap-around by using the max value as something special.
The mutex=0 unlock only needs to be release, not seq_cst. (Same for the -=1 shared unlock, but that won't make it more efficient on x86, only on weakly-ordered ISAs.) Also, compare_exchange_weak would be totally fine; you're retrying in a loop anyway, so spurious failure is no different from a failed compare.
If you're on x86, you'd normally want _mm_pause() inside your spin loop, and possibly some kind of backoff to reduce contention if multiple threads are all trying to acquire the lock at once.
And usually you want to spin read-only until you see the lock available, not keep hammering with atomic RMWs. (See Does cmpxchg write destination cache line on failure? If not, is it better than xchg for spinlock?)
Also, short is a strange choice; if any size is going to perform worse than int, it's often short. But probably it's fine, and OK I guess if that helps it pack into the same cache line as the data you're modifying. (Although that cache line will then be the victim of false-sharing contention from other threads hammering on it trying to take the lock.)