In which situation will std::map<A,B> be faster than a sorted std::vector<std::pair<A,B>>?
I was using a map in some code to store ordered data. I found out that for huge maps, destruction could take a while. In the code I had, replacing map by vector<pair> reduced processing time by a factor of 10000...
Finally, I was so surprised that I decided to compare the performance of map with that of a sorted vector of pair.
And I'm surprised because I could not find a situation where map was faster than a sorted vector of pair (filled randomly and later sorted)... There must be some situations where map is faster, otherwise what's the point of providing this class?
Here is what I tested:
Test one: compare filling and destroying a map vs. filling, sorting (because I want a sorted container) and destroying a vector:
#include <iostream>
#include <time.h>
#include <cstdlib>
#include <map>
#include <vector>
#include <algorithm>
int main(void)
{
    clock_t tStart = clock();
    {
        std::map<float,int> myMap;
        for ( int i = 0; i != 10000000; ++i )
        {
            myMap[ ((float)std::rand()) / RAND_MAX ] = i;
        }
    }
    std::cout << "Time taken by map: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
    tStart = clock();
    {
        std::vector< std::pair<float,int> > myVect;
        for ( int i = 0; i != 10000000; ++i )
        {
            myVect.push_back( std::make_pair( ((float)std::rand()) / RAND_MAX, i ) );
        }
        // sort the vector, as we want a sorted container:
        std::sort( myVect.begin(), myVect.end() );
    }
    std::cout << "Time taken by vect: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
    return 0;
}
Compiled with g++ main.cpp -O3 -o main and got:
Time taken by map: 21.7142
Time taken by vect: 7.94725
map is 3 times slower...
Then I said, "OK, vector is faster to fill and sort, but search will be faster with the map"... so I tested:
#include <iostream>
#include <time.h>
#include <cstdlib>
#include <map>
#include <vector>
#include <algorithm>
int main(void)
{
    clock_t tStart = clock();
    {
        std::map<float,int> myMap;
        float middle = 0;
        float last;
        for ( int i = 0; i != 10000000; ++i )
        {
            last = ((float)std::rand()) / RAND_MAX;
            myMap[ last ] = i;
            if ( i == 5000000 )
                middle = last; // element we will later search
        }
        std::cout << "Map created after " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
        float sum = 0;
        for ( int i = 0; i != 10; ++i )
            sum += myMap[ last ]; // search it
        std::cout << "Sum is " << sum << std::endl;
    }
    std::cout << "Time taken by map: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
    tStart = clock();
    {
        std::vector< std::pair<float,int> > myVect;
        std::pair<float,int> middle;
        std::pair<float,int> last;
        for ( int i = 0; i != 10000000; ++i )
        {
            last = std::make_pair( ((float)std::rand()) / RAND_MAX, i );
            myVect.push_back( last );
            if ( i == 5000000 )
                middle = last; // element we will later search
        }
        std::sort( myVect.begin(), myVect.end() );
        std::cout << "Vector created after " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
        float sum = 0;
        for ( int i = 0; i != 10; ++i )
            sum += (std::find( myVect.begin(), myVect.end(), last ))->second; // search it
        std::cout << "Sum is " << sum << std::endl;
    }
    std::cout << "Time taken by vect: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
    return 0;
}
Compiled with g++ main.cpp -O3 -o main and got:
Map created after 19.5357
Sum is 1e+08
Time taken by map: 21.41
Vector created after 7.96388
Sum is 1e+08
Time taken by vect: 8.31741
Even search is apparently faster with the vector (10 searches with the map took almost 2 seconds, and only half a second with the vector)...
So: is map simply a class to avoid, or are there really situations where map offers good performance?

Generally a map will be better when you're doing a lot of insertions and deletions interspersed with your lookups. If you build the data structure once and then only do lookups, a sorted vector will almost certainly be faster, if only because of processor cache effects. Since insertions and deletions at arbitrary locations in a vector are O(n) instead of O(log n), there will come a point where those become the limiting factor.
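To make the O(n) versus O(log n) point concrete, here is a minimal sketch of what a single insertion looks like in each container. The names Entry, insert_sorted and insert_map are hypothetical helpers, not part of the question's code; the float key and int value types are simply borrowed from the benchmark above.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using Entry = std::pair<float, int>;  // hypothetical alias: (key, value)

// Insert into a vector that is kept sorted by key.
// Finding the position is O(log n), but vector::insert has to shift
// every later element, so the whole operation is O(n).
void insert_sorted(std::vector<Entry>& v, float key, int value)
{
    // Debug-only precondition check; compiled out with -DNDEBUG.
    assert(std::is_sorted(v.begin(), v.end(),
        [](const Entry& a, const Entry& b) { return a.first < b.first; }));

    auto pos = std::lower_bound(v.begin(), v.end(), key,
        [](const Entry& e, float k) { return e.first < k; });
    v.insert(pos, Entry(key, value));
}

// The map counterpart: O(log n), and no existing element is moved.
// Note that operator[] overwrites the value if the key already exists,
// whereas insert_sorted above keeps duplicate keys.
void insert_map(std::map<float, int>& m, float key, int value)
{
    m[key] = value;
}
```

Calling insert_sorted in a loop over n elements costs O(n^2) in the worst case, which is why filling the vector with push_back and sorting once, as the question does, is the cheaper way to build it.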
std::find has linear time complexity, whereas a map search has O(log N) complexity.
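A sorted vector can match the map's logarithmic lookup if it is searched with a binary search rather than std::find. The following is a sketch under the assumption that the vector is sorted by key; Entry and find_sorted are illustrative names that do not appear in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using Entry = std::pair<float, int>;  // hypothetical alias: (key, value)

// Look up `key` in a vector sorted by key, using binary search (O(log n)).
// Returns a pointer to the matching entry, or nullptr if the key is absent.
const Entry* find_sorted(const std::vector<Entry>& v, float key)
{
    // Debug-only precondition check; compiled out with -DNDEBUG.
    assert(std::is_sorted(v.begin(), v.end(),
        [](const Entry& a, const Entry& b) { return a.first < b.first; }));

    // lower_bound returns the first entry whose key is not less than `key`.
    auto it = std::lower_bound(v.begin(), v.end(), key,
        [](const Entry& e, float k) { return e.first < k; });

    if (it != v.end() && it->first == key)
        return &*it;
    return nullptr;
}
```

In the question's second test, this kind of lookup would take the place of the std::find call, which is what makes the comparison with map's own search a like-for-like one.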
When you find that one algorithm is 100000x faster than the other, you should get suspicious! Your benchmark is invalid.
You need to compare realistic variants. You probably meant to compare map with a binary search. Run each of those variants for at least 1 second of CPU time so that you can realistically compare the results.
When a benchmark reports "0.00001 seconds" of time spent, you are well within the clock's inaccuracy noise. This number means nothing.
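One way to follow the "at least 1 second of CPU time" advice is to repeat the operation until enough time has accumulated and then report the average. This is only a sketch: seconds_per_call is a hypothetical helper, and it relies on std::clock(), the same CPU-time clock the question's code already uses.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>

// Call `op` repeatedly until at least `min_seconds` of CPU time has been
// consumed, then return the average number of seconds per call.
// `op` should have an observable effect (for example, adding to a sum that
// is printed afterwards), otherwise the optimizer may remove the work.
template <typename Op>
double seconds_per_call(Op op, double min_seconds = 1.0)
{
    assert(min_seconds > 0.0);

    const std::clock_t start = std::clock();
    std::size_t calls = 0;
    double elapsed = 0.0;
    do {
        op();
        ++calls;
        elapsed = static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
    } while (elapsed < min_seconds);

    return elapsed / static_cast<double>(calls);
}
```

For very cheap operations the std::clock() call inside the loop can dominate the measurement, so in that case it may be better to have op perform a batch of lookups per call and divide the result accordingly.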