std::mutex::try_lock spuriously fail?

Maybe I'm misunderstanding std::mutex::try_lock:

This function is allowed to fail spuriously and return false even if the mutex is not currently locked by any other thread.

Does this mean that if no thread holds a lock on that mutex, a call to try_lock could still return false? For what purpose?

Isn't try_lock supposed to return false if the mutex is locked and true if nobody has locked it? I'm not really sure whether my non-native English is fooling me...

This means that if no one thread has a lock of that mutex, when I try a try_lock, it could return false?

Yes, that's exactly what it says.

Isn't the function of try_lock return false if its locked OR true if nobody lock it?

No, the function of try_lock is to try to lock the mutex.

However, there is more than one way it can fail:

  1. the mutex is already locked elsewhere (this is the one you're thinking of)
  2. some platform-specific feature interrupts or prevents the locking attempt, and control is returned to the caller, who can decide whether to retry.

The common case on POSIX-ish platforms, inherited from POSIX threads, is that a signal is delivered to (and handled by a signal handler in) the current thread, interrupting the lock attempt.

There may be other platform-specific reasons on other platforms, but the behaviour is the same.
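The "caller can decide whether to retry" point can be sketched as a small helper. This is only an illustration under assumptions: the name try_lock_with_retries and its max_attempts parameter are hypothetical and not part of the standard library, and whether retrying makes sense at all depends on the caller.

```cpp
#include <mutex>

// Hypothetical helper (not a standard facility): call try_lock() up to
// max_attempts times and report whether the lock was acquired.
// Works with any type that provides try_lock(), e.g. std::mutex.
template <class Lockable>
bool try_lock_with_retries(Lockable& m, int max_attempts) {
    for (int i = 0; i < max_attempts; ++i) {
        if (m.try_lock()) {
            return true;  // acquired; the caller must call m.unlock() later
        }
        // false only means "not acquired this time": it may be a genuine
        // conflict or a spurious failure, and the caller cannot tell which.
    }
    return false;
}
```

A caller might write `if (try_lock_with_retries(m, 3)) { /* work */ m.unlock(); }`. Bounding the number of attempts keeps the call from turning into a blocking lock.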

Based on your comments, I would write (quoting your words):

std::unique_lock<std::mutex> lock(m, std::defer_lock); // m being a mutex
...
if (lock.try_lock()) {
  ... // "DO something if nobody has a lock"
} else {
  ... // "GO AHEAD"
}

Note that lock.try_lock() effectively calls m.try_lock(), so it is prone to spurious failure as well. But I wouldn't worry much about this issue. IMO, in practice, spurious failures/wakeups are quite rare (as Useless pointed out, on Linux they can happen when a signal is delivered).

For more about spurious issues, see e.g. https://en.wikipedia.org/wiki/Spurious_wakeup or "Why does pthread_cond_wait have spurious wakeups?".

UPDATE

If you really want to eliminate spurious failures of try_lock, you can use an atomic flag, for example:

// shared by threads:
std::mutex m;  
std::atomic<bool> flag{false};

// within threads:
std::unique_lock<std::mutex> lock(m, std::defer_lock); // m being a mutex
...
while (true) {
  lock.try_lock();
  if (lock.owns_lock()) {
    flag = true;
    ... // "DO something if nobody has a lock"    
    flag = false;
    break;
  } else if (flag == true) {
    ... // "GO AHEAD"
    break;
  }
}

It could probably be rewritten in a better form; I didn't check. Also, note that flag is not automatically unset via RAII, so some scope guard may be useful here.

UPDATE 2

If you do not also need the blocking functionality of a mutex, use std::atomic_flag:
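As a rough idea of what such a scope guard could look like, here is a minimal sketch. The class name FlagGuard is hypothetical; it assumes the flag is a std::atomic<bool> as in the snippet above.

```cpp
#include <atomic>

// Hypothetical scope guard: sets the flag on construction and clears it
// on destruction, so the flag is reset even if the guarded code throws
// or returns early.
class FlagGuard {
public:
    explicit FlagGuard(std::atomic<bool>& flag) : flag_(flag) { flag_ = true; }
    ~FlagGuard() { flag_ = false; }

    FlagGuard(const FlagGuard&) = delete;
    FlagGuard& operator=(const FlagGuard&) = delete;

private:
    std::atomic<bool>& flag_;
};
```

Inside the owns_lock() branch, declaring `FlagGuard guard(flag);` would stand in for the manual `flag = true; ... flag = false;` pair.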

std::atomic_flag lock = ATOMIC_FLAG_INIT;

// within threads:
// test_and_set() returns the previous value, so false means we set the flag
if (!lock.test_and_set()) {
    ... // "DO something if nobody has a lock"
    lock.clear();
} else {
    ... // "GO AHEAD"
}

Again, clearing the flag would be better done via some RAII mechanism.

If the call to try_lock() returns true, the call succeeded in locking the lock. It returns false if it did not. That's all. Yes, the function can return false when nobody else has the lock. False means only that the attempt to lock did not succeed; it does not tell you why it failed.

Unlike what was said there, I don't think there is any reason for a try_lock function to fail for OS-related reasons: such an operation is non-blocking, so signals cannot really interrupt it. Most likely it has everything to do with how this function is implemented at the CPU level. After all, the uncontended case is usually the most interesting one for a mutex.

Mutex locking usually requires some form of atomic compare-exchange operation. C++11 and C11 introduce atomic_compare_exchange_strong and atomic_compare_exchange_weak. The latter is allowed to fail spuriously.

By allowing try_lock to fail spuriously, implementations are allowed to use atomic_compare_exchange_weak to maximize performance and minimize code size.
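One possible shape for such an RAII mechanism is sketched below. The class name TryFlagLock and its owns_lock() member are hypothetical, chosen to mirror std::unique_lock; the memory orders are an assumption about what a lock-like use would want.

```cpp
#include <atomic>

// Hypothetical RAII wrapper that uses std::atomic_flag as a try-only lock.
class TryFlagLock {
public:
    explicit TryFlagLock(std::atomic_flag& f)
        : flag_(f),
          // test_and_set() returns the previous value:
          // false means the flag was clear and we have just set it.
          owns_(!f.test_and_set(std::memory_order_acquire)) {}

    ~TryFlagLock() {
        // Only the owner clears the flag.
        if (owns_) flag_.clear(std::memory_order_release);
    }

    TryFlagLock(const TryFlagLock&) = delete;
    TryFlagLock& operator=(const TryFlagLock&) = delete;

    bool owns_lock() const { return owns_; }

private:
    std::atomic_flag& flag_;
    bool owns_;
};
```

With this, the thread code becomes `TryFlagLock l(flag); if (l.owns_lock()) { /* DO something */ } else { /* GO AHEAD */ }`, and the flag is cleared when l goes out of scope.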

For example, on ARM64 atomic operations are usually implemented using exclusive-load (LDXR) and exclusive-store (STXR) instructions. LDXR fires up "monitor" hardware which starts tracking all accesses to a memory region. STXR only performs the store if no accesses to that region were made between the LDXR and STXR instructions. Thus the whole sequence can fail spuriously if another thread accesses that memory region or if there was an IRQ in between the two.

In practice, the code generated for a try_lock implemented with the weak guarantee is not very different from the one implemented with the strong guarantee.
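To make the weak/strong relationship concrete, here is a sketch in C++ of how "strong" behaviour can be layered on top of a weak compare-exchange by retrying only when the failure was spurious. The function name cas_strong_from_weak is hypothetical, and this is not how any particular standard library is implemented.

```cpp
#include <atomic>

// Illustrative only: emulate strong compare-exchange semantics using
// compare_exchange_weak plus a retry loop.
bool cas_strong_from_weak(std::atomic<int>& obj, int& expected, int desired) {
    const int wanted = expected;
    while (!obj.compare_exchange_weak(expected, desired)) {
        // On failure, expected has been overwritten with the observed value.
        if (expected != wanted) {
            return false;  // genuine mismatch: report failure to the caller
        }
        // The observed value matched, so the failure was spurious: retry.
    }
    return true;
}
```

The retry loop is exactly the part that a try_lock permitted to fail spuriously can leave out.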

#include <stdatomic.h>
#include <stdbool.h>

bool mutex_trylock_weak(atomic_int *mtx)
{
    int old = 0;
    return atomic_compare_exchange_weak(mtx, &old, 1);
}

bool mutex_trylock_strong(atomic_int *mtx)
{
    int old = 0;
    return atomic_compare_exchange_strong(mtx, &old, 1);
}

Take a look at the generated assembly for ARM64:

mutex_trylock_weak:
  sub sp, sp, #16
  mov w1, 0
  str wzr, [sp, 12]
  ldaxr w3, [x0]      ; exclusive load (acquire)
  cmp w3, w1
  bne .L3
  mov w2, 1
  stlxr w4, w2, [x0]  ; exclusive store (release)
  cmp w4, 0           ; the only difference is here
.L3:
  cset w0, eq
  add sp, sp, 16
  ret

mutex_trylock_strong:
  sub sp, sp, #16
  mov w1, 0
  mov w2, 1
  str wzr, [sp, 12]
.L8:
  ldaxr w3, [x0]      ; exclusive load (acquire)
  cmp w3, w1
  bne .L9
  stlxr w4, w2, [x0]  ; exclusive store (release)
  cbnz w4, .L8        ; the only difference is here
.L9:
  cset w0, eq
  add sp, sp, 16
  ret

The only difference is that the "weak" version eliminates the conditional backward branch cbnz w4, .L8 and replaces it with cmp w4, 0. In the absence of branch prediction information, CPUs predict backward conditional branches as "will be taken", since they are assumed to be part of a loop. That assumption is wrong in this case, as most of the time the lock will be acquired (low contention is assumed to be the most common case).

IMO this is the only performance difference between those functions. The "strong" version can basically suffer from a 100% branch misprediction ratio on a single instruction under some workloads.

By the way, ARMv8.1 introduces atomic instructions, so there is no difference between the two, just like on x86_64. Code generated with the -march=armv8.1-a flag:

  sub sp, sp, #16
  mov w1, 0
  mov w2, 1
  mov w3, w1
  str wzr, [sp, 12]
  casal w3, w2, [x0]
  cmp w3, w1
  cset w0, eq
  add sp, sp, 16
  ret

Some try_lock functions can fail even when atomic_compare_exchange_strong is used. For example, try_lock_shared of shared_mutex might need to increment a reader counter and might fail if another reader has entered the lock in the meantime. A "strong" variant of such a function would need to generate a loop and thus can suffer from a similar branch misprediction.

Another minor detail: if the mutex is written in C, some compilers (like Clang) might align the loop at a 16-byte boundary to improve its performance, bloating the function body with padding. This is unnecessary if the loop almost always runs a single time.
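The reader-counter case can be sketched as follows. Everything here is hypothetical and heavily simplified (the struct SketchSharedLock, its state encoding, and both member names are made up for illustration); a real shared_mutex is considerably more involved.

```cpp
#include <atomic>

// Hypothetical, simplified reader/writer state:
//   state == -1 : held exclusively by a writer
//   state >=  0 : number of readers currently holding the lock
struct SketchSharedLock {
    std::atomic<int> state{0};

    // Single attempt: may fail even with no writer present, if another
    // reader changes the counter between the load and the compare-exchange.
    bool try_lock_shared_once() {
        int s = state.load(std::memory_order_relaxed);
        if (s < 0) return false;  // a writer holds the lock
        return state.compare_exchange_strong(
            s, s + 1, std::memory_order_acquire, std::memory_order_relaxed);
    }

    // "Strong" variant: retries until it succeeds or a writer is seen,
    // which requires a loop with a backward branch.
    bool try_lock_shared_retrying() {
        int s = state.load(std::memory_order_relaxed);
        while (s >= 0) {
            if (state.compare_exchange_weak(
                    s, s + 1, std::memory_order_acquire,
                    std::memory_order_relaxed)) {
                return true;
            }
            // s now holds the freshly observed value; loop re-checks it.
        }
        return false;
    }

    void unlock_shared() { state.fetch_sub(1, std::memory_order_release); }
};
```

Here the single-attempt version fails "spuriously" from the caller's point of view whenever another reader wins the race, even though a strong compare-exchange is used.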


Another reason for spurious failure is a failure to acquire an internal mutex lock (if the mutex is implemented using a spinlock and some kernel primitive). In theory the same principle could be applied in a kernel implementation of try_lock, although this does not seem reasonable.

Section 3 of the paper Foundations of the C++ Concurrency Memory Model already gives a clear explanation of why the standard allows spurious failures of try_lock. In short, it is specified this way to make the semantics of try_lock consistent with the definition of a data race in the C++ memory model.
