Today I wrote some code to test the performance of mutexes.
This is the Boost (1.54) version, compiled with VS2010 and /O2 optimization:
boost::mutex m;
auto start = boost::chrono::system_clock::now();
for (size_t i = 0; i < 50000000; ++i) {
    boost::lock_guard<boost::mutex> lock(m);
}
auto end = boost::chrono::system_clock::now();
boost::chrono::duration<double> elapsed_seconds = end - start;
std::cout << elapsed_seconds.count() << std::endl;
And this is the std version, compiled with VS2013, also with /O2 optimization:
std::mutex m;
auto start = std::chrono::system_clock::now();
for (size_t i = 0; i < 50000000; ++i) {
    std::lock_guard<std::mutex> lock(m);
}
auto end = std::chrono::system_clock::now();
std::chrono::duration<double> elapsed_seconds = end - start;
std::cout << elapsed_seconds.count() << std::endl;
The two versions differ slightly but do the same thing. My CPU is an Intel Core i7-2600K, my OS is 64-bit Windows 7, and the results are 0.7020 s for boost::mutex vs 2.1684 s for std::mutex, so the std version is about 3.09 times slower.
boost::mutex tries _interlockedbittestandset first, and only if that fails does it fall back to the heavyweight WaitForSingleObject. That is simple to understand.
The std::mutex in VS2013 seems much more complex. I have tried to understand it but could not see the point. Why is it so complex? Is there a faster way?
It seems that std::mutex might rely only on system calls, which carry a lot of overhead, whereas boost::mutex implements at least some of its functionality in user space: it tries to avoid system calls whenever possible, which would explain the _interlockedbittestandset check before WaitForSingleObject.
I don't know the actual internals of Microsoft's standard library, but I've seen performance differences like this in examples from an operating systems class.
The test only measures locking an unlocked mutex, with no contention from other threads.
Suppose the mutex were already locked. After Boost's initial attempt fails, would it be better for the thread to spin or to block? That depends on the application, and the std one may perform better under heavy load.
When the situation calls for a highly efficient mutex, a lock-free alternative that achieves the same goal is worth exploring.