What container type provides better (average) performance than std::map?

Question

In the following example a std::map structure is filled with 26 entries, using a - z for the keys and 0 - 25 for the values. The time taken (on my system) to look up the last entry (10000000 times) is roughly 250 ms for the vector, and 125 ms for the map. (I compiled using release mode, with the O3 option turned on for g++ 4.4.)

But if for some odd reason I wanted better performance than the std::map, what data structures and functions would I need to consider using?

I apologize if the answer seems obvious to you, but I haven't had much experience in the performance critical aspects of C++ programming.

#include <ctime>
#include <map>
#include <vector>
#include <iostream>

struct mystruct
{
    char key;
    int value;

    mystruct(char k = 0, int v = 0) : key(k), value(v) { }
};

int find(const std::vector<mystruct>& ref, char key)
{
    for (std::vector<mystruct>::const_iterator i = ref.begin(); i != ref.end(); ++i)
        if (i->key == key) return i->value;

    return -1;
}

int main()
{
    std::map<char, int> mymap;
    std::vector<mystruct> myvec;

    for (int i = 'a'; i < 'a' + 26; ++i)
    {
        mymap[i] = i - 'a';
        myvec.push_back(mystruct(i, i - 'a'));
    }

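    std::clock_t pre = std::clock();

    for (int i = 0; i < 10000000; ++i)
    {
        find(myvec, 'z');
    }

    // clock() returns ticks, so convert to milliseconds before printing
    std::cout << "linear scan: milli " << (std::clock() - pre) * 1000.0 / CLOCKS_PER_SEC << "\n";

    pre = std::clock();

    for (int i = 0; i < 10000000; ++i)
    {
        mymap['z'];
    }

    std::cout << "map scan: milli " << (std::clock() - pre) * 1000.0 / CLOCKS_PER_SEC << "\n";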

    return 0;
}

Answer 1

For your example, use int value(char x) { return x - 'a'; }

More generally, since the "keys" are contiguous and dense, use an array (or vector) to guarantee Θ(1) access time.

If you don't need the keys to be sorted, use unordered_map, which should improve most operations by a logarithmic factor on average (i.e. O(log n) -> O(1)).
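As a rough sketch of the unordered_map suggestion, assuming a compiler with C++11 support for the <unordered_map> header, the question's table could be built and queried as follows. The names make_table, lookup and check_lookup are made up for illustration and do not come from the question.

```cpp
#include <cassert>
#include <unordered_map>

// Build a hash table mapping 'a'..'z' to 0..25, mirroring the question's data.
std::unordered_map<char, int> make_table()
{
    std::unordered_map<char, int> table;
    table.reserve(26); // reserve buckets up front to avoid rehashing while filling
    for (char c = 'a'; c <= 'z'; ++c)
        table[c] = c - 'a';
    return table;
}

// Average O(1) lookup; returns -1 when the key is absent
// (the same convention as the question's find()).
int lookup(const std::unordered_map<char, int>& table, char key)
{
    std::unordered_map<char, int>::const_iterator it = table.find(key);
    return it != table.end() ? it->second : -1;
}

// Small self-check documenting the expected results.
void check_lookup()
{
    const std::unordered_map<char, int> table = make_table();
    assert(lookup(table, 'z') == 25);
    assert(lookup(table, '?') == -1);
}
```

Whether this actually beats std::map for only 26 keys depends on the hash and the platform, so it is worth measuring rather than assuming.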

(Sometimes, esp. for small data sets, linear search is faster than a hash table (unordered_map) or a balanced binary tree (map) because the former has a much simpler algorithm, thus reducing the hidden constant in big-O. Profile, profile, profile.)
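One possible way to follow the "profile" advice is a small timing harness like the one below. It is only a sketch, assuming C++11 (<chrono>, lambdas, <unordered_map>); time_ms, Timings and compare_lookups are hypothetical names, and the numbers it produces will vary with compiler, flags and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <unordered_map>
#include <utility>
#include <vector>

// Run `lookup` `iterations` times and return the elapsed time in milliseconds.
// Results are accumulated into a volatile sink so the optimizer cannot
// discard the loop.
template <typename Lookup>
double time_ms(Lookup lookup, int iterations)
{
    volatile long sink = 0;
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink = sink + lookup();
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

struct Timings
{
    double vector_ms;
    double map_ms;
    double hash_ms;
};

// Time the same lookup against a vector (linear scan), a std::map and a
// std::unordered_map, each filled with 'a'..'z' -> 0..25.
Timings compare_lookups(char key, int iterations)
{
    assert(iterations > 0);

    std::vector<std::pair<char, int>> vec;
    std::map<char, int> tree;
    std::unordered_map<char, int> hash;
    for (char c = 'a'; c <= 'z'; ++c)
    {
        vec.push_back(std::make_pair(c, c - 'a'));
        tree[c] = c - 'a';
        hash[c] = c - 'a';
    }

    Timings t;
    t.vector_ms = time_ms([&]() -> int {
        for (const auto& entry : vec)
            if (entry.first == key) return entry.second;
        return -1;
    }, iterations);
    t.map_ms = time_ms([&]() -> int {
        auto it = tree.find(key);
        return it != tree.end() ? it->second : -1;
    }, iterations);
    t.hash_ms = time_ms([&]() -> int {
        auto it = hash.find(key);
        return it != hash.end() ? it->second : -1;
    }, iterations);
    return t;
}
```

Calling compare_lookups('z', 10000000) and printing the three fields would reproduce the question's experiment with a hash table added. Note that both map variants are timed with find() rather than operator[].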

Answer 2

For starters, you should probably use std::map::find if you want to compare the search times; operator[] does more than a regular find, because it inserts a default-initialized value whenever the key is not already in the map.
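The difference between the two calls can be illustrated with a short sketch. The helper names find_value, size_after_subscript and check_lookup_difference are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Read-only lookup: never modifies the map; returns -1 when the key is absent.
int find_value(const std::map<char, int>& m, char key)
{
    std::map<char, int>::const_iterator it = m.find(key);
    return it != m.end() ? it->second : -1;
}

// Takes the map by value, applies operator[] to the copy and reports the
// resulting size, which exposes the insertion side effect.
std::size_t size_after_subscript(std::map<char, int> m, char key)
{
    m[key]; // inserts a value-initialized int (0) if key is absent
    return m.size();
}

void check_lookup_difference()
{
    std::map<char, int> m;
    m['a'] = 0;

    assert(find_value(m, 'z') == -1);          // missing key reported as -1
    assert(m.size() == 1);                     // find left the map untouched
    assert(size_after_subscript(m, 'z') == 2); // operator[] added 'z'
    assert(size_after_subscript(m, 'a') == 1); // existing key: nothing added
}
```

Because operator[] may insert, it cannot be called on a const std::map, which is another reason to prefer find() for pure lookups.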

Also, your data set is pretty small, which means that the whole vector will easily fit into the processor cache; a lot of modern processors are optimised for this sort of brute-force search so you'd end up getting fairly good performance. The map, while theoretically having better performance (O(log n) rather than O(n)), can't really exploit its advantage of the smaller number of comparisons because there aren't that many keys to compare against and the overhead of its data layout works against it.

TBH, for data structures this small, the additional performance gain from not using a vector is often negligible. The "smarter" data structures like std::map come into play when you're dealing with larger amounts of data and a well distributed set of data that you are searching for.

Answer 3

If you really just have values for all entries from a to z, why don't you use the letter (properly adjusted) as the index into a vector?

std::vector<int> direct_map;
direct_map.resize(26);

for (int i = 'a'; i < 'a' + 26; ++i) 
{
    direct_map[i - 'a']= i - 'a';
}

// ...

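int find(const std::vector<int> &direct_map, char key)
{
    // Shift the letter so that 'a' maps to slot 0
    int index = key - 'a';
    // Reject keys outside the table; the cast avoids a signed/unsigned comparison
    if (index >= 0 && static_cast<std::vector<int>::size_type>(index) < direct_map.size())
        return direct_map[index];

    return -1;
}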

3 answers

Answer 1 (8, accepted): 2010-04-02 20:29:57
Answer 2 (2): 2010-04-02 20:35:27
Answer 3 (2): 2010-04-02 20:49:38
