
Why is boost::mutex faster than std::mutex as of vs2013?

Today I wrote some code to test the performance of mutex.

This is the Boost (1.54) version, compiled with VS2010 at O2 optimization:

boost::mutex m;
auto start = boost::chrono::system_clock::now();
for (size_t i = 0; i < 50000000; ++i) {
    boost::lock_guard<boost::mutex> lock(m);
}
auto end = boost::chrono::system_clock::now();
boost::chrono::duration<double> elapsed_seconds = end - start;
std::cout << elapsed_seconds.count() << std::endl;

And this is the std version, compiled with VS2013, also at O2 optimization:

std::mutex m;
auto start = std::chrono::system_clock::now();
for (size_t i = 0; i < 50000000; ++i) {
    std::lock_guard<std::mutex> lock(m);
}
auto end = std::chrono::system_clock::now();
std::chrono::duration<double> elapsed_seconds = end - start;
std::cout << elapsed_seconds.count() << std::endl;

The two versions differ slightly but do the same thing. My CPU is an Intel Core i7-2600K and my OS is 64-bit Windows 7. The results are 0.7020 s for boost::mutex versus 2.1684 s for std::mutex, so std::mutex is about 3.09 times slower.

boost::mutex tries _interlockedbittestandset first and, only if that fails, falls back to the heavyweight WaitForSingleObject. That is simple to understand.

The std::mutex in VS2013 seems much more complex. I have tried to understand it, but I could not work out why it is so complex. Is there a faster way?

It seems that std::mutex might rely only on system calls, which carry a lot of overhead, whereas boost::mutex implements at least part of its functionality in user space. In other words, it tries to avoid system calls whenever possible, which would explain the _interlockedbittestandset attempt before WaitForSingleObject.
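The "try an atomic operation first, wait only on failure" idea described above can be sketched in portable C++11. This is only an illustration, not the actual Boost or Microsoft implementation: the class name TryThenBlockMutex is hypothetical, a std::atomic<bool> stands in for the interlocked bit test, and a std::condition_variable stands in for the kernel wait (WaitForSingleObject in the Boost case).

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

// Hypothetical illustration of "try an atomic operation first, block only on failure".
class TryThenBlockMutex {
public:
    void lock() {
        // Fast path: a single atomic exchange when the lock is free.
        if (try_lock()) return;
        // Slow path: sleep until an unlock() lets this thread win the flag.
        std::unique_lock<std::mutex> guard(wait_mutex_);
        wait_cv_.wait(guard, [this] { return try_lock(); });
    }

    bool try_lock() {
        // exchange returns the previous value; false means we acquired the lock.
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() {
        {
            // Clear the flag under wait_mutex_ so a waiter cannot miss the wake-up.
            std::lock_guard<std::mutex> guard(wait_mutex_);
            locked_.store(false, std::memory_order_release);
        }
        wait_cv_.notify_one();
    }

private:
    std::atomic<bool> locked_{false};
    std::mutex wait_mutex_;
    std::condition_variable wait_cv_;
};
```

The class provides lock() and unlock(), so it can be used with std::lock_guard<TryThenBlockMutex>. In this sketch unlock() always touches the internal std::mutex; a production implementation would typically track whether any thread is waiting so that an uncontended unlock stays cheap.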

I don't know the actual internals of Microsoft's standard library implementation, but I have seen performance differences like this in examples from an operating systems class.

The test only measures locking an unlocked mutex, with no contention from other threads.
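To measure the contended case as well, a harness along the following lines could be used. It is a sketch under assumptions: the names ContendedResult and time_contended are hypothetical, the thread count and iteration count are left to the caller, and std::chrono::steady_clock replaces the system_clock from the question because it is monotonic. Thread start-up time is included in the measurement.

```cpp
#include <chrono>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical result type for the sketch below.
struct ContendedResult {
    double seconds;       // wall-clock time until all threads have finished
    std::size_t counter;  // should equal threads * iterations
};

// Hypothetical helper: `threads` threads each lock and unlock `m` `iterations` times.
template <typename Mutex>
ContendedResult time_contended(Mutex& m, unsigned threads, std::size_t iterations) {
    std::size_t counter = 0;  // protected by m
    std::vector<std::thread> workers;
    auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&m, &counter, iterations] {
            for (std::size_t i = 0; i < iterations; ++i) {
                std::lock_guard<Mutex> lock(m);
                ++counter;  // tiny critical section so lock cost dominates
            }
        });
    }
    for (auto& w : workers) w.join();
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return {elapsed.count(), counter};
}
```

For example, calling time_contended(m, 4, 1000000) with a std::mutex m, and again with a boost::mutex, would show how the two compare once several threads compete for the lock.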

Suppose the mutex is already locked. After Boost's initial attempt fails, is it better for the thread to spin or to block? That depends on the application, and the std version may well perform better under heavy load.
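One possible middle ground is to spin for a bounded number of attempts and then back off. The sketch below assumes a hypothetical class SpinThenYieldLock with a hypothetical spin_limit parameter. It never blocks in the kernel: once the spin budget is used up it only calls std::this_thread::yield(), so it illustrates the trade-off and does not replace a real blocking mutex.

```cpp
#include <atomic>
#include <thread>

// Hypothetical illustration: spin a bounded number of times, then yield the CPU.
class SpinThenYieldLock {
public:
    explicit SpinThenYieldLock(int spin_limit = 100) : spin_limit_(spin_limit) {}

    void lock() {
        int spins = 0;
        for (;;) {
            // Attempt the exchange only when the flag looks free (test-and-test-and-set).
            if (!locked_.load(std::memory_order_relaxed) &&
                !locked_.exchange(true, std::memory_order_acquire)) {
                return;
            }
            if (++spins >= spin_limit_) {
                // Spin budget used up: give the rest of the time slice away.
                std::this_thread::yield();
                spins = 0;
            }
        }
    }

    bool try_lock() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() {
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
    const int spin_limit_;
};
```

A larger spin_limit favors short critical sections on multi-core machines, while a small one gives the CPU back sooner when the lock is held for a long time.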

When the situation calls for a highly efficient mutex, a lock-free alternative that achieves the same goal is worth exploring.
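As a minimal example of such an alternative, a shared counter can be updated with an atomic read-modify-write instead of a mutex. The function name count_lock_free is hypothetical, and the approach only applies when the protected state fits in a single atomic variable.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: each thread increments a shared counter with fetch_add, no mutex.
std::size_t count_lock_free(unsigned threads, std::size_t iterations) {
    std::atomic<std::size_t> counter{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, iterations] {
            for (std::size_t i = 0; i < iterations; ++i) {
                // Relaxed ordering suffices: only the final total is read, after join().
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

Whether this beats a mutex depends on the workload, since heavily contended atomics still bounce a cache line between cores.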
