
Hashing Keys in Java

In Java, when I use a String as a key in a HashMap, I get a slightly different result than when I use the String's hash code as the key.

Any insight?

when I use the string hashcode as a key in the HashMap.

You mustn't use the hash code itself as the key. Hash codes aren't intended to be unique: it's entirely permitted for two non-equal values to have the same hash code. You should use the string itself as the key. The map will then compare hash codes first (to narrow down the candidate matches quickly) and then compare with equals for genuine string equality.

Of course, that's assuming your code really is as your question describes it, e.g.

HashMap<String, String> goodMap = new HashMap<String, String>();
goodMap.put("foo", "bar");

HashMap<Integer, String> badMap = new HashMap<Integer, String>();
badMap.put("foo".hashCode(), "bar");

If that's really what your code looks like, just use HashMap<String, String> instead.

From the docs for Object.hashCode() (emphasis mine):

The general contract of hashCode is:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
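To illustrate the third point of the contract, here is a minimal sketch with a deliberately poor hash code. The class names ConstantHashKey and ContractDemo are hypothetical and exist only for this example. Every key returns the same hash code, which the contract allows; the map still distinguishes unequal keys because it falls back on equals.

```java
import java.util.HashMap;
import java.util.Map;

class ContractDemo {
    // Counts the entries left after inserting two unequal keys and one repeated key.
    static int distinctEntries() {
        Map<ConstantHashKey, String> map = new HashMap<>();
        map.put(new ConstantHashKey("foo"), "first");
        map.put(new ConstantHashKey("bar"), "second");
        map.put(new ConstantHashKey("foo"), "third"); // equal key: overwrites "first"
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctEntries());
    }
}

// Hypothetical key type, used only for illustration.
final class ConstantHashKey {
    private final String name;

    ConstantHashKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConstantHashKey
                && ((ConstantHashKey) other).name.equals(name);
    }

    // Legal under the contract: equal objects trivially share a hash code.
    // Unequal objects collide too, which hurts performance, not correctness.
    @Override
    public int hashCode() {
        return 42;
    }
}
```

With a constant hash code every key lands in the same bucket, so lookups degrade, but the map's contents remain correct.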

Of course. Different Strings can have the same hashCode, so if you store two such strings as keys in a map, you'll have two entries (since the strings are different). Whereas if you use their hashCode as the key, you'll have only one entry (since their hashCode is the same).
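A small sketch of that difference, using "Aa" and "BB", two strings that happen to share a hash code. The class and method names are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Keys are the strings themselves: two distinct keys, two entries.
    static int sizeWithStringKeys() {
        Map<String, String> byString = new HashMap<>();
        byString.put("Aa", "first");
        byString.put("BB", "second");
        return byString.size();
    }

    // Keys are the hash codes: both strings map to the same Integer key.
    static int sizeWithHashCodeKeys() {
        Map<Integer, String> byHash = new HashMap<>();
        byHash.put("Aa".hashCode(), "first");
        byHash.put("BB".hashCode(), "second"); // same key: replaces "first"
        return byHash.size();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(sizeWithStringKeys());               // 2
        System.out.println(sizeWithHashCodeKeys());             // 1
    }
}
```

In the second map the value for "Aa" is silently lost, which would explain results that differ only slightly from the String-keyed version.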

The hashCode isn't used to tell if two keys are equal. It's only used to assign a bucket to the key. Once the bucket is found, every key contained in the bucket is compared to the new key with equals, and the new key is added to the bucket if no equal key is found.
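The bucket-then-equals procedure can be sketched roughly as follows. TinyMap is a hypothetical, heavily simplified class written for this example; it is not how java.util.HashMap is actually implemented (no resizing, no null keys), but it follows the same two steps.

```java
import java.util.ArrayList;
import java.util.List;

class BucketDemo {
    public static void main(String[] args) {
        TinyMap<String, String> map = new TinyMap<>(4);
        map.put("Aa", "first");
        map.put("BB", "second");   // same hash code as "Aa", so same bucket
        map.put("Aa", "replaced"); // equal key: the value is overwritten
        System.out.println(map.get("Aa"));
        System.out.println(map.get("BB"));
        System.out.println(map.size());
    }
}

// Hypothetical simplified map, for illustration only.
final class TinyMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();
    private int size;

    TinyMap(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: the hash code only selects a bucket.
    private List<Entry<K, V>> bucketFor(K key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    void put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        // Step 2: equals decides whether the key is already present.
        for (Entry<K, V> entry : bucket) {
            if (entry.key.equals(key)) {
                entry.value = value;
                return;
            }
        }
        bucket.add(new Entry<>(key, value));
        size++;
    }

    V get(K key) {
        for (Entry<K, V> entry : bucketFor(key)) {
            if (entry.key.equals(key)) {
                return entry.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

Because equality is checked with equals inside the bucket, "Aa" and "BB" stay separate entries even though they share a bucket.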

The problem is that two objects being different doesn't mean that their hash codes are also different.

Two different objects can share the same hash code, so you shouldn't use hash codes as HashMap keys.

Also, because hash codes returned from the Object.hashCode() method are of type int, there are only 2^32 possible values. That's why, depending on the hashing algorithm, you will get "collisions" between different objects.
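For String specifically, the hashing algorithm is documented as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in int arithmetic. The sketch below recomputes that formula by hand (the class and method names are invented for the example) to show how two short strings end up with the same value.

```java
class StringHashDemo {
    // Recomputes the formula documented for String.hashCode():
    // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], with int overflow.
    static int documentedHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'A' = 65, 'a' = 97, 'B' = 66
        System.out.println(documentedHash("Aa")); // 65*31 + 97 = 2112
        System.out.println(documentedHash("BB")); // 66*31 + 66 = 2112
        System.out.println(documentedHash("Aa") == "Aa".hashCode()); // true
    }
}
```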

In short:

!obj.equals(obj1) doesn't ensure that obj.hashCode() != obj1.hashCode().

Hash codes can be the same for different Strings, so be careful with that. Maybe this is why you are getting a different result.

Here's another SO question on it. See Jon Skeet's accepted answer.

You can use the hash code as the key only if the hash function is a perfect hash (see e.g. GPERF). As long as your key objects don't reside in memory, you are correct that you will save memory.


 