
How does the order of elements in a HashSet work?

I understand that the order of elements in a HashSet is supposed to be arbitrary. But out of curiosity, could anyone tell me exactly how the order is determined?

I noticed that when I insert two elements (say A and B), the order comes out as A, B; re-executing the same code then gives me B, A, and executing it a third time gives me A, B again.

I mean, that's kind of un-deterministic, and a bit weird.

The order is determined by the hashing algorithm used within the hash map/set, the exact settings of that map (such as its capacity and load factor), and the hash codes of the objects.

If your objects have consistent hash codes across multiple runs (Strings, for example) and are inserted in the same order into a map with the same settings, then in general they will come out in the same order each time. If their hash codes are not consistent across runs, the order will generally vary as well.
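
As a rough illustration of that point, the sketch below compares String keys, whose hashCode() is defined by a fixed formula, with plain Object instances that fall back on the JVM's identity hash. The class name StableHashOrderDemo and the helper iterationOrder are hypothetical names made up for this example; the behaviour shown is what one would typically observe, not something the HashSet API promises.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StableHashOrderDemo {
    // Build a HashSet (default settings) from the given strings, inserted in
    // the given order, and return a snapshot of its iteration order.
    static List<String> iterationOrder(String... values) {
        Set<String> set = new HashSet<>();
        for (String v : values) {
            set.add(v);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // String.hashCode() is computed from the characters alone, so these
        // values are the same on every run.
        System.out.println("\"A\".hashCode() = " + "A".hashCode());
        System.out.println("\"B\".hashCode() = " + "B".hashCode());

        // Same elements, same insertion order, same settings: in practice
        // both sets iterate identically.
        System.out.println(iterationOrder("A", "B"));
        System.out.println(iterationOrder("A", "B"));

        // Object does not override hashCode(), so these identity hash codes
        // typically differ from one run of the program to the next.
        System.out.println(new Object().hashCode());
        System.out.println(new Object().hashCode());
    }
}
```

If A and B in the question are instances of a class that does not override hashCode(), the last two lines hint at why the order can flip between runs.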

The source for HashMap can be seen here: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java

In fact, an interesting quote from that source is:

This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

So not only may the order be different each time your program runs, but the API itself makes no guarantee that the order will remain constant even within a single run of the program!

"Un-deterministic and a bit weird" is a good description of the ordering of a HashMap, and it is pretty much what the docs say. If you want ordering, use either LinkedHashMap or TreeMap. If you don't need ordering, then don't worry about it: by leaving the ordering effectively arbitrary, HashMap can give you extremely fast responses from the methods whose behavior it does guarantee!

In principle there are two contributing factors:

  1. The hash codes of your keys might be non-deterministic. This is the case when you rely on the default Object.hashCode() implementation, which is JVM-specific and may be derived from the object's memory location.

  2. HashSet itself can be non-deterministic. Take a look at HashMap.initHashSeedAsNeeded (HashSet uses HashMap as its underlying data structure in the standard Oracle SDK): depending on some factors, it can use sun.misc.Hashing.randomHashSeed(this) to initialize the hashSeed field, which is then used when computing the hash of a key.

Randomization can be important for achieving probabilistic performance guarantees. This is what the Javadoc says for hashSeed:

/**
 * A randomizing value associated with this instance that is applied to
 * hash code of keys to make hash collisions harder to find. If 0 then
 * alternative hashing is disabled.
 */

The order will not change (in practice) unless you add something to or remove something from your HashSet.
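
A small sketch of that claim, assuming nothing else touches the set between the two iterations. RepeatedIterationDemo and snapshot are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RepeatedIterationDemo {
    // Copy the set's current iteration order into a list.
    static <T> List<T> snapshot(Set<T> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>(Arrays.asList("A", "B", "C"));

        List<String> first = snapshot(set);
        List<String> second = snapshot(set);
        // Nothing was added or removed in between, so in practice the two
        // snapshots list the elements in the same order.
        System.out.println(first.equals(second));

        // After a modification the order is allowed to differ from before;
        // whether it actually does depends on the hash codes and table size.
        set.add("D");
        System.out.println(snapshot(set));
    }
}
```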

The order is based on the internal hash table buckets, and that depends on both the hashCode() of each object and the size of the hash table.

Simplified example:

A's hash code is 10 and B's hash code is 11. The hash table has size 2. The mapping from hash code to position in the hash table is based purely on the last bit, i.e. even hash codes go into table[0] and odd ones into table[1].

table[0] = { A }
table[1] = { B }

Iterating over those values would most likely give A, B now. And that result should be reproducible each time as long as the table size stays the same.

Adding a third element C with hash code 12 would (when not resizing the table) add it to bucket #0 as well.

table[0] = { A, C }
table[1] = { B }

So your iteration order would be A, C, B. Or, depending on whether you inserted A before C, it could be C, A, B.

In practice, adding elements will at some point resize the table and re-hash using an adjusted mapping. E.g. the table size would be doubled and the last 2 bits could be used to determine the bucket:

table[0] = { C }
table[1] = {   }
table[2] = { A }
table[3] = { B }

And the order would have changed completely by adding just 1 element.
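
The simplified example above can be replayed with a toy model. This is only a sketch of the idea, not the real java.util.HashMap code: it assumes the bucket is simply the low bits of the hash code (hash & (tableSize - 1), with a power-of-two table size) and that each bucket keeps its elements in insertion order. ToyBucketDemo, iterationOrder and example are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ToyBucketDemo {
    // Put each named element into bucket (hash & (tableSize - 1)) and return
    // the names in the order a bucket-by-bucket iteration would visit them.
    // tableSize is assumed to be a power of two.
    static List<String> iterationOrder(Map<String, Integer> hashCodes, int tableSize) {
        List<List<String>> table = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            table.add(new ArrayList<>());
        }
        for (Map.Entry<String, Integer> e : hashCodes.entrySet()) {
            int bucket = e.getValue() & (tableSize - 1);
            table.get(bucket).add(e.getKey());
        }
        List<String> order = new ArrayList<>();
        for (List<String> bucket : table) {
            order.addAll(bucket);
        }
        return order;
    }

    // The elements from the example: A=10, B=11, C=12, inserted in that order.
    static Map<String, Integer> example() {
        Map<String, Integer> hashCodes = new LinkedHashMap<>();
        hashCodes.put("A", 10);
        hashCodes.put("B", 11);
        hashCodes.put("C", 12);
        return hashCodes;
    }

    public static void main(String[] args) {
        System.out.println(iterationOrder(example(), 2)); // prints [A, C, B]
        System.out.println(iterationOrder(example(), 4)); // prints [C, A, B]
    }
}
```

The same three elements come out in a different order purely because the table size changed, which is the effect a resize has inside a real hash table.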

A plain HashSet keeps and guarantees no order at all, not even a stable arbitrary one (see: Why can hashCode() return the same value for different objects in Java?)! Don't try to force an order on it! Serialize and deserialize the set and the original order may be destroyed.

Use LinkedHashSet instead of HashSet.
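
A minimal sketch of the ordered alternatives: LinkedHashSet iterates in insertion order, and TreeSet iterates in sorted order. OrderedSetDemo and its two helper methods are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class OrderedSetDemo {
    // LinkedHashSet remembers the order in which elements were first inserted.
    static List<String> insertionOrder(List<String> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    // TreeSet keeps its elements sorted (natural ordering for Strings).
    static List<String> sortedOrder(List<String> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("pear", "apple", "fig", "apple");
        System.out.println(insertionOrder(values)); // prints [pear, apple, fig]
        System.out.println(sortedOrder(values));    // prints [apple, fig, pear]
    }
}
```

Both still reject duplicates like HashSet does; the difference is only that their iteration order is part of their documented contract.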
