Fast Insertions into STL map

The following code is from cplusplus.com. It has two inserts, labelled as efficient and inefficient. I think that the efficient one should give the hint as mymap.begin() + 1, because *(mymap.begin() + 1) is 'z' and 'z' will follow 'b'.

The function optimizes its insertion time if position points to the element that will follow the inserted element (or to the end, if it would be the last).

The best hint for inserting 'c' would be *(mymap.begin() + 2), because it has to pass 'a' and 'b'.
Right or wrong? I tried timing my proposed code and comparing it to the 'efficient' one here, but I see no difference. Probably because I have a million tabs open and music playing too, and because this is a trivial example.

  std::map<char,int> mymap;

  // first insert function version (single parameter):
  mymap.insert ( std::pair<char,int>('a',100) );
  mymap.insert ( std::pair<char,int>('z',200) );

  // second insert function version (with hint position):
  std::map<char,int>::iterator it = mymap.begin();
  mymap.insert (it, std::pair<char,int>('b',300));  // max efficiency inserting
  mymap.insert (it, std::pair<char,int>('c',400));  // no max efficiency inserting

The "efficient" version is only efficient if you provide it a good hint. Your hint (.begin()) is wrong. Now, in a container with just two elements, you can't be very wrong, so the damage is limited.

The specification of the semantics of hinted inserts changed with C++11 (as indicated in this answer). See DR 233 for the resolution and N1780 for part of the discussion which led to that resolution.

The defect report and discussion paper are primarily about std::multimap and std::multiset, in which duplicate keys are allowed. In that case, if the "hint" refers to an element with a key equal to the key being inserted, then the new element could be inserted either before or after the hint, and the pre-C++11 standard left that ambiguous. DR 233 makes the decision deterministic, but it can also be read as affecting the specification of behaviour for std::map and std::set.
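To make the deterministic rule concrete, here is a minimal sketch, assuming a standard library that implements the C++11 wording (a hinted insert places the new element as close as possible to the position just before the hint). The function name hinted_duplicate_order is made up for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns the mapped values in iteration order after
// two hinted inserts of a duplicate key into a std::multimap.
std::string hinted_duplicate_order() {
    std::multimap<int, char> mm;
    auto first = mm.insert({1, 'a'});  // the only element so far

    // The hint refers to an element with an equal key. Under the C++11
    // rule the new element goes immediately before the hint.
    mm.insert(first, {1, 'b'});

    // The hint is end(), so the new element goes after all equal keys.
    mm.insert(mm.end(), {1, 'c'});

    assert(mm.size() == 3);

    std::string order;
    for (const auto& kv : mm) order += kv.second;
    return order;  // "bac" under the C++11 rule
}
```

Before C++11, an implementation could have placed 'b' on either side of 'a' and still conformed.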

In the original specification (prior to C++11), the standard simply said "iterator p is a hint pointing to where the insert should start to search," which is not very specific about whether the hint should point before or after the insertion point. (Nor does it say anything about how the search proceeds in case the hint is wrong, since the new element must be inserted at a correct position regardless of the hint.) However, the complexity of the operation was documented as being "logarithmic in general, but amortized constant if t is inserted right after p".

That complexity specification is obviously wrong on two counts: first, it does not insist on constant time insertion if t is not inserted (because the hint points to an element whose key compares equal), but any reasonable implementation could hardly fail to be constant-time in this case. Second, if the new element is to be inserted at the beginning of the container, there is no possible way of specifying a hint prior to the insertion point.

In fact, major implementations of the standard library actually expected the hint to point just after the insertion point, although most also checked to see if it was just before. So existing practice was to provide amortized constant time complexity in cases not required by the standard (which, of course, is permitted), with at least one widely-used implementation failing to provide the required complexity.

So the code in cplusplus.com is, at best, imprecise, and definitely fails to describe a normal use case for hinted insertion.
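For comparison, here is a sketch of the same example with a hint that matches the C++11 rule, namely an iterator to the element that will follow the new one. Note that map iterators are bidirectional, so begin() + 1 does not compile; std::next is used instead. The function name build_with_good_hints is made up for this illustration.

```cpp
#include <cassert>
#include <iterator>
#include <map>

// Hypothetical rewrite of the cplusplus.com example with useful hints.
std::map<char, int> build_with_good_hints() {
    std::map<char, int> mymap;
    mymap.insert(std::pair<char, int>('a', 100));
    mymap.insert(std::pair<char, int>('z', 200));

    // 'z' is the element that will follow both 'b' and 'c'.
    std::map<char, int>::iterator hint = std::next(mymap.begin());
    assert(hint->first == 'z');

    mymap.insert(hint, std::pair<char, int>('b', 300));
    // Inserting into a map does not invalidate iterators, so hint still
    // refers to 'z', which is also the element that follows 'c'.
    mymap.insert(hint, std::pair<char, int>('c', 400));
    return mymap;
}
```

With only a handful of elements the difference is not measurable, which is consistent with the timing result reported in the question.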

Suppose that it is expensive to construct a mapped value for a given key. (Perhaps the map memoizes an expensive function and there is no cheap default constructor for the mapped value.) In that case, you would probably want to check whether the map already contains the key before going to the trouble of computing the corresponding value to be inserted. A naive implementation would be something like:

if (mymap.find(key) == mymap.end())
   mymap[key] = expensive_function(key);
// See Note 1 for another slightly more efficient variant

The result is that the same logarithmic search is done twice if the key is not present. Of course, the extra cost of the unnecessary search is probably trivial compared with the cost of expensive_function, but, still, it seems like there should be a better solution. And there is: we do the first search with std::map::lower_bound, leading to only slightly more complex code:

auto where = mymap.lower_bound(key);
if (where == mymap.end() || where->first != key) 
  where = mymap.emplace_hint(where, key, expensive_function(key));
/* Here, 'where' points to the element with the specified key */

(I used std::map::emplace_hint, available since C++11, rather than insert, in part to attempt to avoid an unnecessary copy, as well as to avoid cluttering the code with std::make_pair.)
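The lower_bound pattern above can be wrapped in a small memoizing helper. This is only a sketch: the int key and value types, the call counter, the body of expensive_function and the name memoized are all stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <map>

// Stand-in for a costly computation; the counter is only there so the
// number of calls can be observed.
static int call_count = 0;
int expensive_function(int key) {
    ++call_count;
    return key * key;
}

// Returns the cached value for key, computing and inserting it only if
// the key is absent. A single logarithmic search is done either way.
int memoized(std::map<int, int>& cache, int key) {
    auto where = cache.lower_bound(key);
    if (where == cache.end() || where->first != key) {
        // where is the element that will follow the new one, which is
        // exactly the hint that C++11 asks for.
        where = cache.emplace_hint(where, key, expensive_function(key));
    }
    assert(where != cache.end());
    return where->second;
}
```

Calling memoized twice with the same key should invoke expensive_function only once; the second call finds the key through lower_bound and returns the stored value.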

Notes

  1. Instances of that code are very easy to find. Many go on to reference mymap[key] in order to use the stored value, adding yet another unnecessary logarithmic search; better code would be:

     auto where = mymap.find(key);
     if (where == mymap.end())
       where = mymap.emplace(key, expensive_function(key)).first;
