Is the unordered_map really unordered?

I am very confused by the name 'unordered_map'. The name suggests that the keys are not ordered at all. But I always thought they were ordered by their hash value. Or is that wrong (because the name implies that they are not ordered)?

Or, to put it differently: is this

typedef map<K, V, HashComp<K> > HashMap;

with

template<typename T>
struct HashComp {
    bool operator()(const T& v1, const T& v2) const {
        return hash<T>()(v1) < hash<T>()(v2);
    }
};

the same as

typedef unordered_map<K, V> HashMap;

? (OK, not exactly: STL will complain here because there may be keys k1, k2 for which neither k1 < k2 nor k2 < k1 holds. You would need to use multimap and override the equality check.)

Or, put yet another way: when I iterate through them, can I assume that the key list is ordered by hash value?

In answer to your edited question: no, those two snippets are not equivalent at all. std::map stores nodes in a tree structure; unordered_map stores them in a hash table*.
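One observable difference, which the question itself hints at, can be sketched as follows. This is a minimal illustration, assuming a deliberately weak hash (the hypothetical LengthHash) and a comparator modelled on the question's HashComp (here called HashLess); neither name comes from the standard library. A std::map ordered only by hash treats colliding keys as the same key, while unordered_map still tells them apart through operator==.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical, deliberately weak hash: strings of equal length collide.
struct LengthHash {
    std::size_t operator()(const std::string& s) const { return s.size(); }
};

// Comparator in the spirit of the question's HashComp: orders keys by hash only.
template <typename T, typename Hash>
struct HashLess {
    bool operator()(const T& a, const T& b) const {
        return Hash()(a) < Hash()(b);
    }
};

using TreeByHash = std::map<std::string, int, HashLess<std::string, LengthHash>>;
using RealHashMap = std::unordered_map<std::string, int, LengthHash>;

// Inserts two different keys whose hashes collide and reports how many
// entries the container ends up holding.
template <typename Map>
std::size_t entries_after_collision() {
    Map m;
    m.emplace("ab", 1);
    m.emplace("cd", 2);  // same length as "ab", so the same hash value
    return m.size();
}

// Usage sketch: the tree keeps one entry, the hash table keeps both.
void collision_example() {
    assert(entries_after_collision<TreeByHash>() == 1);
    assert(entries_after_collision<RealHashMap>() == 2);
}
```

With a good hash such collisions are rare, but they are never impossible, so the two typedefs differ in behavior and not only in internal layout.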

Keys are not stored in order of their "hash value" because they're not stored in any order at all. They are instead stored in "buckets", where each bucket corresponds to a set of hash values (in the sketch below, all hashes with the same remainder modulo the bucket count). Basically, the implementation goes like this:

function add_value(object key, object value) {
   int hash = key.getHash();

   int bucket_index = hash % NUM_BUCKETS;
   if (buckets[bucket_index] == null) {
       buckets[bucket_index] = new linked_list();
   }
   buckets[bucket_index].add(new key_value(key, value));
}

function get_value(object key) {
   int hash = key.getHash();

   int bucket_index = hash % NUM_BUCKETS;
   if (buckets[bucket_index] == null) {
       return null;
   }

   foreach(key_value kv in buckets[bucket_index]) {
       if (kv.key == key) {
           return kv.value;
       }
   }
   return null;  // key not found in its bucket
}

Obviously that's a serious simplification, and a real implementation would be much more advanced (for example, supporting resizing of the buckets array, maybe using a tree structure instead of a linked list for the buckets, and so on), but it should give an idea of why you can't get the values back in any particular order. See Wikipedia for more information.
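For readers who prefer compilable code, the same idea can be written as a small C++ class. This is only a sketch under stated assumptions: ToyHashMap and its member names are invented for illustration, keys are strings, values are ints, the bucket count is fixed, and (unlike the pseudocode) adding an existing key overwrites its value. It is not how any particular standard library implements unordered_map.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Toy chained hash table. ToyHashMap and its members are illustrative names,
// not part of any standard library.
class ToyHashMap {
public:
    explicit ToyHashMap(std::size_t num_buckets) : buckets_(num_buckets) {
        assert(num_buckets > 0);
    }

    // Stores the pair in the bucket selected by the key's hash.
    // Unlike the pseudocode, an existing key is overwritten.
    void add_value(const std::string& key, int value) {
        auto& bucket = buckets_[index_of(key)];
        for (auto& kv : bucket) {
            if (kv.first == key) {
                kv.second = value;
                return;
            }
        }
        bucket.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* get_value(const std::string& key) const {
        for (const auto& kv : buckets_[index_of(key)]) {
            if (kv.first == key) {
                return &kv.second;
            }
        }
        return nullptr;
    }

    // Keys in traversal order: bucket by bucket, insertion order inside a bucket.
    std::vector<std::string> keys() const {
        std::vector<std::string> out;
        for (const auto& bucket : buckets_) {
            for (const auto& kv : bucket) {
                out.push_back(kv.first);
            }
        }
        return out;
    }

private:
    // Reduces the full-range hash to a bucket index.
    std::size_t index_of(const std::string& key) const {
        return std::hash<std::string>()(key) % buckets_.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> buckets_;
};
```

The order returned by keys() depends on std::hash and on the bucket count, neither of which the caller controls, which is why no particular order can be relied on.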


* Technically, the internal implementations of std::map and unordered_map are implementation-defined, but the standard requires certain Big-O complexities for their operations, which in practice implies these implementations.

"Unordered" doesn't mean that there isn't a linear sequence somewhere in the implementation. It means "you can't assume anything about the order of these elements".

For example, people often assume that entries will come out of a hash map in the same order they were put in. But they don't, because the entries are unordered.

As for "ordered by their hash value": hash values are generally taken from the full range of integers, but hash maps don't have 2**32 slots in them. The hash value's range is reduced to the number of slots by taking it modulo the number of slots. Further, as you add entries to a hash map, it might change size to accommodate the new values. This can cause all the previous entries to be re-placed, changing their order.
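The effect of the modulo step and of resizing can be worked through with a small sketch. It assumes an identity hash (the hash of k is k itself) and a bucket-by-bucket traversal, both simplifications chosen to keep the arithmetic easy to follow; traversal_order is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the keys in the order a bucket-by-bucket traversal would visit them,
// assuming an identity hash (hash(k) == k) and chaining in insertion order.
std::vector<unsigned> traversal_order(const std::vector<unsigned>& keys,
                                      std::size_t num_buckets) {
    assert(num_buckets > 0);
    std::vector<std::vector<unsigned>> buckets(num_buckets);
    for (unsigned k : keys) {
        buckets[k % num_buckets].push_back(k);  // reduce the hash to a slot index
    }
    std::vector<unsigned> order;
    for (const auto& bucket : buckets) {
        order.insert(order.end(), bucket.begin(), bucket.end());
    }
    return order;
}
```

For the keys 1, 10, 7, 12 inserted in that order, 8 buckets give the traversal 1, 10, 12, 7 (slots 1, 2, 4, 7), while 5 buckets give 10, 1, 7, 12 (slots 0, 1, 2, 2). The same entries come out in a different order after the table changes size, and neither order matches the sorted hash values.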

In an unordered data structure, you can't assume anything about the order of the entries.

As the name unordered_map suggests, no ordering is specified by the C++0x standard. An unordered_map's apparent ordering depends on whatever is convenient for the actual implementation.
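If a program does need a defined order, one option is to impose it explicitly instead of relying on the container. The following is a minimal sketch, assuming string keys and int values; sorted_keys is a hypothetical helper name. Using std::map in the first place is another option.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Collects the keys of an unordered_map and sorts them, so the caller gets a
// well-defined order no matter how the implementation lays out its buckets.
std::vector<std::string> sorted_keys(
        const std::unordered_map<std::string, int>& m) {
    std::vector<std::string> keys;
    keys.reserve(m.size());
    for (const auto& kv : m) {
        keys.push_back(kv.first);
    }
    std::sort(keys.begin(), keys.end());
    return keys;
}

// Usage sketch: the result does not depend on the map's internal order.
void sorted_keys_example() {
    const std::unordered_map<std::string, int> m{
        {"pear", 3}, {"apple", 1}, {"fig", 2}};
    const std::vector<std::string> expected{"apple", "fig", "pear"};
    assert(sorted_keys(m) == expected);
}
```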

If you want an analogy, look at the RDBMS of your choice.

If you don't specify an ORDER BY clause when performing a query, the results are returned "unordered", that is, in whatever order the database feels like. The order is not specified, and the system is free to "order" them however it likes in order to get the best performance.

You are right, unordered_map is actually hash ordered. Note that most current implementations (pre-TR1) call it hash_map.

The IBM C/C++ compiler documentation remarks that "if you have an optimal hash function, the number of operations performed during lookup, insertion, and removal of an arbitrary element does not depend on the number of elements in the sequence", so this means that the order is not so unordered...

Now, what does it mean that it is hash ordered? As a hash should be unpredictable, by definition you can't make any assumption about the order of the elements in the map. This is the reason why it was renamed in TR1: the old name suggested an order. Now we know that an order is actually used, but you can disregard it, as it is unpredictable.
