Why is std::mutex so much worse than std::shared_mutex in Visual C++?

I ran the following in Visual Studio 2022 in Release mode:

#include <chrono>
#include <mutex>
#include <shared_mutex>
#include <iostream>

std::mutex mx;
std::shared_mutex smx;

constexpr int N = 100'000'000;

int main()
{
    auto t1 = std::chrono::steady_clock::now();
    for (int i = 0; i != N; i++)
    {
        std::unique_lock<std::mutex> l{ mx };
    }
    auto t2 = std::chrono::steady_clock::now();
    for (int i = 0; i != N; i++)
    {
        std::unique_lock<std::shared_mutex> l{ smx };
    }
    auto t3 = std::chrono::steady_clock::now();

    auto d1 = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1);
    auto d2 = std::chrono::duration_cast<std::chrono::duration<double>>(t3 - t2);

    std::cout << "mutex " << d1.count() << "s;  shared_mutex " << d2.count() << "s\n";
    std::cout << "mutex " << sizeof(mx) << " bytes;  shared_mutex " << sizeof(smx) << " bytes \n";
}

The output is as follows:

mutex 2.01147s;  shared_mutex 1.32065s
mutex 80 bytes;  shared_mutex 8 bytes

Why is that?

It is unexpected that the more feature-rich std::shared_mutex is faster than std::mutex, whose features are a strict subset.

TL;DR: an unfortunate combination of backward-compatibility and ABI-compatibility constraints makes std::mutex bad until the next ABI break. std::shared_mutex, on the other hand, is good.


A decent implementation of std::mutex would first try to acquire the lock with an atomic operation; if the lock is busy, it might spin in a read loop (with a pause instruction on x86), and ultimately it would fall back to an OS wait.

There are a couple of ways to implement such a std::mutex:

  1. Delegate directly to the corresponding OS APIs that do all of the above.
  2. Do the spinning and the atomic operations itself, and call OS APIs only for the OS wait.
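The second approach can be sketched in portable C++ as follows. This is only an illustration, not the MSVC implementation: the class name SpinThenWaitMutex, the spin limit, and the helper contended_increment are all hypothetical. Where the library provides C++20 atomic waiting, std::atomic::wait stands in for the OS wait; otherwise the sketch falls back to yielding, which is an assumption made purely to keep it compilable.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of "atomic attempt, then spin, then OS wait".
class SpinThenWaitMutex {
public:
    void lock() noexcept {
        // 1. Fast path: a single atomic operation takes an uncontended lock.
        if (try_lock()) return;
        // 2. Bounded spinning in a read loop; only attempt the write when the
        //    lock looks free. A real implementation would also issue a CPU
        //    pause hint here.
        for (int spin = 0; spin < kSpinLimit; ++spin) {
            if (!locked_.load(std::memory_order_relaxed) && try_lock()) return;
        }
        // 3. Slow path: block until the flag changes, then retry.
        while (!try_lock()) wait_while_locked();
    }

    bool try_lock() noexcept {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() noexcept {
        locked_.store(false, std::memory_order_release);
#ifdef __cpp_lib_atomic_wait
        locked_.notify_one();  // wake one blocked waiter, if any
#endif
    }

private:
    void wait_while_locked() noexcept {
#ifdef __cpp_lib_atomic_wait
        locked_.wait(true, std::memory_order_relaxed);  // blocks while value is true
#else
        std::this_thread::yield();  // assumption: stand-in when C++20 waiting is absent
#endif
    }

    static constexpr int kSpinLimit = 100;  // arbitrary value for illustration
    std::atomic<bool> locked_{false};
};

// Hypothetical helper: several threads increment a counter under the mutex.
long contended_increment(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinThenWaitMutex mtx;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<SpinThenWaitMutex> guard(mtx);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Note that the whole state is a single atomic flag, so such a mutex stays small and needs neither a dynamic initializer nor explicit destruction.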

Sure, the first way is easier to implement, friendlier to debugging, and more robust, so it appears to be the way to go. The candidate APIs are:

  • CRITICAL_SECTION APIs. A recursive mutex that lacks a static initializer and needs explicit destruction.
  • SRWLOCK. A non-recursive shared mutex that has a static initializer and doesn't need explicit destruction.
  • WaitOnAddress. An API to wait for a particular variable to change, similar to the Linux futex.

These primitives have OS version requirements:

  • CRITICAL_SECTION has existed since, I think, Windows 95, though TryEnterCriticalSection was not present in Windows 9x. The ability to use CRITICAL_SECTION with CONDITION_VARIABLE arrived in Windows Vista, together with CONDITION_VARIABLE itself.
  • SRWLOCK has existed since Windows Vista, but TryAcquireSRWLockExclusive only since Windows 7, so SRWLOCK can directly implement std::mutex only starting with Windows 7.
  • WaitOnAddress was added in Windows 8.

At the time std::mutex was added, Visual Studio's C++ library still had to support Windows XP, so std::mutex was implemented by doing things on its own. In fact, std::mutex and the other synchronization primitives were delegated to ConCRT (Concurrency Runtime).

For Visual Studio 2015, the implementation was switched to use the best available mechanism: SRWLOCK starting with Windows 7, and CRITICAL_SECTION on Windows Vista. ConCRT turned out not to be the best mechanism, but it was still used for Windows XP and Windows Server 2003. The polymorphism was implemented by using placement new to construct classes with virtual functions in a buffer provided by std::mutex and the other primitives.
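The buffer-plus-placement-new technique can be illustrated with a small portable sketch. Everything here is hypothetical: the names MutexImpl, SmallImpl, LargeImpl, and BufferedMutex are invented, the modern_os flag stands in for the runtime OS detection, and the toy implementations only track a flag rather than performing real locking. The point is only to show why the wrapper must be as large as the largest candidate and why every operation goes through a virtual call.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

enum class Kind { Small, Large };

// Hypothetical common interface for the runtime-selected implementations.
struct MutexImpl {
    virtual void lock() = 0;
    virtual void unlock() = 0;
    virtual Kind kind() const = 0;
    virtual ~MutexImpl() = default;
};

// Stands in for a pointer-sized, SRWLOCK-style implementation.
// Toy only: tracks a flag, is not thread-safe.
struct SmallImpl final : MutexImpl {
    bool held = false;
    void lock() override { assert(!held); held = true; }
    void unlock() override { assert(held); held = false; }
    Kind kind() const override { return Kind::Small; }
};

// Stands in for a bulky legacy implementation; the padding models its state.
struct LargeImpl final : MutexImpl {
    unsigned char legacy_state[64] = {};
    bool held = false;
    void lock() override { assert(!held); held = true; }
    void unlock() override { assert(held); held = false; }
    Kind kind() const override { return Kind::Large; }
};

class BufferedMutex {
public:
    // The choice happens at run time, so this constructor cannot be constexpr.
    explicit BufferedMutex(bool modern_os) {
        if (modern_os) {
            impl_ = ::new (static_cast<void*>(storage_)) SmallImpl;
        } else {
            impl_ = ::new (static_cast<void*>(storage_)) LargeImpl;
        }
    }
    ~BufferedMutex() { impl_->~MutexImpl(); }
    BufferedMutex(const BufferedMutex&) = delete;
    BufferedMutex& operator=(const BufferedMutex&) = delete;

    void lock() { impl_->lock(); }      // virtual dispatch on every call
    void unlock() { impl_->unlock(); }
    Kind kind() const { return impl_->kind(); }

private:
    // The buffer is sized for the largest candidate, even if the small one is chosen.
    alignas(std::max_align_t) unsigned char storage_[sizeof(LargeImpl)];
    MutexImpl* impl_;
};

static_assert(sizeof(SmallImpl) <= sizeof(LargeImpl), "buffer must fit both candidates");
static_assert(alignof(LargeImpl) <= alignof(std::max_align_t), "buffer alignment must suffice");
```

Even when the small implementation is selected, sizeof(BufferedMutex) is dictated by the large one, which mirrors how a size chosen for an older implementation stays frozen into the layout.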

Note that this implementation breaks the requirement that the std::mutex constructor be constexpr, because of the runtime detection, the placement new, and the inability of the pre-Windows 7 implementations to get by with only a static initializer.

As time passed, Windows XP support was finally dropped in VS 2019 and Windows Vista support was dropped in VS 2022. A change was made to avoid ConCRT usage, and another change is planned to avoid even the runtime detection of SRWLOCK (disclosure: I contributed these PRs). Still, because of ABI compatibility from VS 2015 through VS 2022, it is not possible to simplify the std::mutex implementation and get rid of all this placing of classes with virtual functions.

What is sadder still: although SRWLOCK has a static initializer, that same compatibility prevents a constexpr mutex constructor, because the implementation still has to be constructed there with placement new. It is not possible to avoid the placement new and construct the implementation directly inside std::mutex, because std::mutex has to be a standard-layout class (see "Why is std::mutex a standard-layout class?").

So the size overhead comes from the size of the ConCRT mutex, and the runtime overhead comes from the chain of calls:

  • a library function call to get to the standard library implementation;
  • a virtual function call to get to the SRWLOCK-based implementation;
  • finally, the Windows API call.

The virtual function call is more expensive than usual because the standard library DLLs are built with /guard:cf (Control Flow Guard), which adds a check to indirect calls.


std::shared_mutex was designed to support only systems starting with Windows 7, so it uses SRWLOCK directly.

The size of std::shared_mutex is the size of SRWLOCK, which is the same size as a pointer (though internally it is not a pointer).

It still involves some avoidable overhead: it calls into the C++ runtime library just to call the Windows API, instead of calling the Windows API directly. This looks fixable with the next ABI, though.

The std::shared_mutex constructor could be constexpr, as SRWLOCK does not need a dynamic initializer, but the standard prohibits implementations from voluntarily adding constexpr to standard classes.
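To illustrate why a pointer-sized, zero-initialized lock state poses no obstacle to a constexpr constructor, here is a minimal sketch. The class PointerSizedMutexState and the helpers are hypothetical stand-ins, not SRWLOCK or the MSVC std::shared_mutex; the sketch only shows that such a type can be constructed during constant evaluation.

```cpp
#include <cassert>

// Hypothetical stand-in for a lock whose whole state is one pointer-sized
// word that starts out as zero (the property attributed to SRWLOCK above).
class PointerSizedMutexState {
public:
    constexpr PointerSizedMutexState() noexcept = default;
    constexpr bool is_unlocked() const noexcept { return state_ == nullptr; }

private:
    void* state_ = nullptr;  // no dynamic initialization required
};

// Constructing the object inside a constant expression proves that the
// constructor is usable at compile time.
constexpr bool constructs_at_compile_time() {
    PointerSizedMutexState m{};
    return m.is_unlocked();
}

static_assert(constructs_at_compile_time(),
              "a zero-initialized state needs no runtime constructor");
static_assert(sizeof(PointerSizedMutexState) == sizeof(void*),
              "the state is exactly pointer-sized");

// Runtime view of the same fact, for use from ordinary code.
inline void check_fresh_state(const PointerSizedMutexState& m) {
    assert(m.is_unlocked());
}
```

A type like this can live at namespace scope without any dynamic initialization order concerns, which is the practical benefit of a constexpr constructor for a mutex.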
