
Efficiency of stl::map of stl::sets

Original post: 2013-09-02 23:17:34 | Tags: c++, boost, stl, boost-icl

I believe I'd like to use boost::icl::interval_map to solve a problem (described here; I'll post a complete answer if interval_maps ultimately work).

I want to use an interval_map<unsigned long long, set<foo*>>, but the documentation for boost::icl mentions that there are potential efficiency problems (quoted below).

We are introducing interval_maps using an interval map of sets of strings, because of its didactic advantages. The party example is used to give an immediate access to the basic ideas of interval maps and aggregate on overlap. For real world applications, an interval_map of sets is not necessarily recommended. It has the same efficiency problems as a std::map of std::sets. There is a big realm though of using interval_maps with numerical and other efficient data types for the associated values.

What are the efficiency issues with a std::map of std::sets, and how can I avoid them?

1 Answer

Both std::map<K, V> and std::set<V> are node-based containers linked by pointers. Traversing them has nice complexity guarantees (i.e., each individual operation is at most O(log n)), but you need fairly sizable containers for that complexity to matter compared to, e.g., a std::vector<std::pair<K, V>>, especially when K and V are fundamental types. The main performance issue with node-based containers is that their nodes are laid out more or less randomly in memory, while contemporary CPUs prefer to access data that is clustered in some form.
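To make the comparison concrete, here is a minimal sketch of the contiguous alternative: a key-sorted std::vector<std::pair<K, V>> searched with std::lower_bound. The names Key, FlatMap, make_flat and flat_find are hypothetical helpers invented for this illustration, and the value type is fixed to int only to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical names for illustration; none of them come from the question.
using Key = unsigned long long;
using FlatMap = std::vector<std::pair<Key, int>>;

// Copy a node-based map into one contiguous, key-sorted buffer.
// std::map iterates in key order, so no extra sort is needed.
FlatMap make_flat(const std::map<Key, int>& m) {
    FlatMap flat(m.begin(), m.end());
    assert(std::is_sorted(flat.begin(), flat.end()));
    return flat;
}

// Binary search over the sorted vector: O(log n) comparisons, like
// std::map::find, but every probe touches the same contiguous buffer.
// Returns a pointer to the value, or nullptr if the key is absent.
const int* flat_find(const FlatMap& flat, Key key) {
    auto it = std::lower_bound(
        flat.begin(), flat.end(), key,
        [](const std::pair<Key, int>& entry, Key k) { return entry.first < k; });
    if (it != flat.end() && it->first == key) {
        return &it->second;
    }
    return nullptr;
}
```

The trade-off is that inserting into the middle of a sorted vector is O(n), so this layout tends to suit data that is built once and queried many times.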

Of course, as usual, you'll need to measure the times obtained with the different implementations on fairly realistic data sets to determine which data structure yields the best performance.
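A measurement along those lines could be sketched as follows. This is only a rough harness under stated assumptions: the LookupTimes struct and the time_lookups function are hypothetical, the keys are uniformly random 64-bit integers (which may not resemble your real data), and a serious benchmark would repeat the runs and use your actual key and value types.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Hypothetical result type: elapsed microseconds plus checksums that
// keep the compiler from discarding the lookups.
struct LookupTimes {
    double map_us;
    double vector_us;
    std::uint64_t map_sum;
    std::uint64_t vector_sum;
};

// Time `queries` random lookups against a std::map and against a sorted
// vector holding the same n entries. Assumes uniformly random keys.
LookupTimes time_lookups(std::size_t n, std::size_t queries) {
    using Entry = std::pair<std::uint64_t, std::uint64_t>;
    std::mt19937_64 rng(42);  // fixed seed so runs are repeatable
    std::uniform_int_distribution<std::uint64_t> dist(0, 4 * n);

    std::map<std::uint64_t, std::uint64_t> tree;
    while (tree.size() < n) {
        const std::uint64_t k = dist(rng);
        tree[k] = k;
    }
    std::vector<Entry> flat(tree.begin(), tree.end());  // already key-sorted

    std::vector<std::uint64_t> probes(queries);
    for (auto& p : probes) p = dist(rng);

    using Clock = std::chrono::steady_clock;
    LookupTimes r{};

    const auto t0 = Clock::now();
    for (std::uint64_t p : probes) {
        auto it = tree.find(p);
        if (it != tree.end()) r.map_sum += it->second;
    }
    const auto t1 = Clock::now();
    for (std::uint64_t p : probes) {
        auto it = std::lower_bound(
            flat.begin(), flat.end(), p,
            [](const Entry& e, std::uint64_t k) { return e.first < k; });
        if (it != flat.end() && it->first == p) r.vector_sum += it->second;
    }
    const auto t2 = Clock::now();

    r.map_us = std::chrono::duration<double, std::micro>(t1 - t0).count();
    r.vector_us = std::chrono::duration<double, std::micro>(t2 - t1).count();
    return r;
}
```

Calling, for example, time_lookups(100000, 1000000) in an optimized build and comparing map_us with vector_us gives a first impression. The two checksums should match, which confirms that both containers answered the same queries identically.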