Why is std::unordered_map slow, and can I use it more effectively to alleviate that?
I've recently found out an odd thing. It seems that calculating Collatz sequence lengths with no caching at all is over 2 times faster than using std::unordered_map to cache all elements.
Note that I did take hints from the question "Is gcc std::unordered_map implementation slow? If so - why?" and tried to use that knowledge to make std::unordered_map perform as well as I could: I used g++ 4.6, which did perform better than more recent versions of g++, and I specified a sound initial bucket count, exactly equal to the maximum number of elements the map must hold.
In comparison, using std::vector to cache a few elements was almost 17 times faster than no caching at all and almost 40 times faster than using std::unordered_map.
Am I doing something wrong, or is this container THAT slow, and why? Can it be made to perform faster? Or maybe hash maps are inherently ineffective and should be avoided whenever possible in high-performance code?
The problematic benchmark is:
#include <iostream>
#include <unordered_map>
#include <cstdint>
#include <ctime>

std::uint_fast16_t getCollatzLength(std::uint_fast64_t val) {
    // Memoization table, seeded with length(1) = 1 and 2168611 initial buckets
    static std::unordered_map<std::uint_fast64_t, std::uint_fast16_t> cache({{1, 1}}, 2168611);
    if (cache.count(val) == 0) {
        if (val % 2 == 0)
            cache[val] = getCollatzLength(val / 2) + 1;
        else
            cache[val] = getCollatzLength(3 * val + 1) + 1;
    }
    return cache[val];
}

int main()
{
    std::clock_t tStart = std::clock();
    std::uint_fast16_t largest = 0;
    // Find the longest sequence among all starting values below one million
    for (int i = 1; i <= 999999; ++i) {
        auto cmax = getCollatzLength(i);
        if (cmax > largest)
            largest = cmax;
    }
    std::cout << largest << '\n';
    std::cout << "Time taken: " << (double)(std::clock() - tStart) / CLOCKS_PER_SEC << '\n';
}
It outputs:
Time taken: 0.761717
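One detail of the cached function above that is independent of the container's internals: on a cache miss it looks up the same key three times (count(), operator[] for the insert, operator[] again for the return), and twice on a hit. A minimal sketch of the same memoized recursion with a single find() per call and a single emplace() per miss follows. The name getCollatzLengthSingleLookup is made up for illustration, and no timing was measured for this variant, so treat it as something to try rather than a known fix.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Sketch: same memoized recursion as the benchmark, but with one find() per
// call and one emplace() per miss. The function name is hypothetical.
std::uint_fast16_t getCollatzLengthSingleLookup(std::uint_fast64_t val) {
    assert(val >= 1);  // Collatz sequences are defined for positive integers
    static std::unordered_map<std::uint_fast64_t, std::uint_fast16_t> cache({{1, 1}}, 2168611);

    auto it = cache.find(val);
    if (it != cache.end())
        return it->second;

    // The recursive call may insert and trigger a rehash, which invalidates
    // iterators, so `it` is not reused after this point.
    const std::uint_fast16_t length =
        getCollatzLengthSingleLookup(val % 2 == 0 ? val / 2 : 3 * val + 1) + 1;
    cache.emplace(val, length);
    return length;
}
```

It can be dropped into the benchmark's main() in place of getCollatzLength without other changes.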
Whereas a benchmark with no caching at all:
#include <iostream>
#include <cstdint>
#include <ctime>

std::uint_fast16_t getCollatzLength(std::uint_fast64_t val) {
    // Count the terms of the sequence, including the starting value and the final 1
    std::uint_fast16_t length = 1;
    while (val != 1) {
        if (val % 2 == 0)
            val /= 2;
        else
            val = 3 * val + 1;
        ++length;
    }
    return length;
}

int main()
{
    std::clock_t tStart = std::clock();
    std::uint_fast16_t largest = 0;
    for (int i = 1; i <= 999999; ++i) {
        auto cmax = getCollatzLength(i);
        if (cmax > largest)
            largest = cmax;
    }
    std::cout << largest << '\n';
    std::cout << "Time taken: " << (double)(std::clock() - tStart) / CLOCKS_PER_SEC << '\n';
}
It outputs:
Time taken: 0.324586
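The std::vector variant mentioned in the question is not shown. For readers who want something to compare against, here is a minimal sketch of what such a cache might look like. It is an assumption, not the asker's code: the limit kCacheLimit, the function name, and the choice to cache only starting values below the limit are all made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical limit: only starting values below this are cached.
constexpr std::uint_fast64_t kCacheLimit = 1000000;

std::uint_fast16_t getCollatzLengthVectorCache(std::uint_fast64_t val) {
    assert(val >= 1);
    // 0 marks "not computed yet"; a valid length is always >= 1.
    static std::vector<std::uint_fast16_t> cache(kCacheLimit, 0);

    // Walk the sequence until reaching 1 or a value whose length is known.
    std::uint_fast64_t cur = val;
    std::uint_fast16_t steps = 0;
    while (cur != 1 && !(cur < kCacheLimit && cache[cur] != 0)) {
        cur = (cur % 2 == 0) ? cur / 2 : 3 * cur + 1;
        ++steps;
    }

    // Remaining length is 1 if we stopped at 1, otherwise the cached length.
    const std::uint_fast16_t length = steps + (cur == 1 ? 1 : cache[cur]);
    if (val < kCacheLimit)
        cache[val] = length;  // direct indexing, no hashing involved
    return length;
}
```

The lookup here is a bounds comparison plus one array read from contiguous memory, which is the property the hash map version lacks.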
The standard library's maps are, indeed, inherently slow (std::map especially, but std::unordered_map as well). Google's Chandler Carruth explains this in his CppCon 2014 talk; in a nutshell: std::unordered_map is cache-unfriendly because it uses linked lists as buckets.
This SO question mentions some efficient hash map implementations; use one of those instead.
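To make the contrast with linked-list buckets concrete, here is a minimal sketch of an open-addressing table with linear probing, where keys and values sit in one contiguous array. It is illustrative only and is not the code of any particular library: the class name FlatCollatzCache, the fixed capacity, the use of key 0 as the empty marker, and the hash mixing constant are all assumptions chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: fixed-capacity open-addressing table, linear probing.
// A probe sequence touches neighbouring slots instead of chasing list nodes.
class FlatCollatzCache {
public:
    // capacity must be a power of two and larger than the number of keys stored.
    explicit FlatCollatzCache(std::size_t capacity)
        : slots_(capacity), mask_(capacity - 1) {
        assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::uint16_t* find(std::uint64_t key) const {
        for (std::size_t i = hash(key) & mask_;; i = (i + 1) & mask_) {
            if (slots_[i].key == kEmpty) return nullptr;
            if (slots_[i].key == key) return &slots_[i].value;
        }
    }

    // Inserts or overwrites. No growth: the caller must keep the table from filling up.
    void insert(std::uint64_t key, std::uint16_t value) {
        assert(key != kEmpty);
        for (std::size_t i = hash(key) & mask_;; i = (i + 1) & mask_) {
            if (slots_[i].key == kEmpty || slots_[i].key == key) {
                slots_[i] = Slot{key, value};
                return;
            }
        }
    }

private:
    static constexpr std::uint64_t kEmpty = 0;  // usable because Collatz values are >= 1
    struct Slot {
        std::uint64_t key;    // value-initialized to 0 (kEmpty) by the vector
        std::uint16_t value;
    };
    // Multiplicative mixing so that consecutive keys do not cluster in one run.
    static std::size_t hash(std::uint64_t key) {
        key *= 0x9E3779B97F4A7C15ull;
        return static_cast<std::size_t>(key >> 32);
    }
    std::vector<Slot> slots_;
    std::size_t mask_;
};
```

A production-quality flat hash map also handles growth, deletion, and arbitrary key types, which is why the advice above is to pick an existing implementation rather than write one.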