
How does the JVM ensure that System.identityHashCode() will never change?

Original post: 2009-06-30 10:57:42 · Tags: java, jvm, hashcode, heap-memory

Typically, the default implementation of Object.hashCode() is some function of the object's allocated address in memory (though the JLS does not mandate this). Given that the VM shunts objects about in memory, why does the value returned by System.identityHashCode() never change during the object's lifetime?

If it is a "one-shot" calculation (the object's hashCode is calculated once and stashed in the object header or something), does that mean two objects can have the same identityHashCode (if they happen to be first allocated at the same address in memory)?

5 Answers

Modern JVMs save the value in the object header. I believe the value is typically calculated only on first use, in order to keep the time spent in object allocation to a minimum (sometimes as low as a dozen cycles). The common Sun JVM can be compiled so that the identity hash code is always 1 for all objects.
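A minimal sketch of how one might observe this from Java code: it reads the identity hash of one object, churns the heap and requests a few collections, then reads the hash again. The class and method names are made up for illustration, and System.gc() is only a hint, so the object is not guaranteed to actually move; the check holds either way because the hashCode contract requires the value to stay the same during one execution.

```java
import java.util.ArrayList;
import java.util.List;

class IdentityHashStability {

    /** Returns true if the identity hash of one object is unchanged after GC pressure. */
    static boolean hashSurvivesGc() {
        Object target = new Object();
        // First request for the hash; per the answer above, this is typically
        // when the value gets computed and stored in the header.
        int before = System.identityHashCode(target);

        // Churn the heap so the collector has a reason to relocate survivors.
        for (int round = 0; round < 5; round++) {
            List<byte[]> garbage = new ArrayList<>();
            for (int i = 0; i < 1_000; i++) {
                garbage.add(new byte[1024]);
            }
            garbage.clear();
            System.gc(); // a hint only; the JVM may ignore it
        }

        int after = System.identityHashCode(target);
        return before == after;
    }

    public static void main(String[] args) {
        System.out.println("stable across GC: " + hashSurvivesGc());
    }
}
```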

Multiple objects can have the same identity hash code. That is the nature of hash codes.

In answer to the second question, irrespective of the implementation, it is possible for multiple objects to have the same identityHashCode.

See bug 6321873 for a brief discussion on the wording in the javadoc, and a program to demonstrate non-uniqueness.
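The following is not the program from that bug report, just a rough sketch of the same idea: allocate objects, keep every one of them reachable, and stop when two live objects report the same identity hash. The class name, method name, and the 1,000,000 cap are arbitrary choices for illustration. Whether and when a collision shows up depends on the JVM, so the method returns -1 if none is seen within the cap.

```java
import java.util.HashMap;
import java.util.Map;

class IdentityHashCollision {

    /**
     * Allocates up to maxObjects objects, keeping all of them reachable.
     * Returns the number of objects allocated when the first duplicate
     * identity hash appeared, or -1 if no duplicate was seen.
     */
    static int firstCollision(int maxObjects) {
        Map<Integer, Object> seen = new HashMap<>();
        for (int i = 1; i <= maxObjects; i++) {
            Object candidate = new Object();
            int hash = System.identityHashCode(candidate);
            // putIfAbsent returns the earlier object if this hash is already taken.
            Object earlier = seen.putIfAbsent(hash, candidate);
            if (earlier != null) {
                // Two distinct, simultaneously live objects share one identity hash.
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int n = firstCollision(1_000_000);
        System.out.println(n < 0
                ? "no collision seen"
                : "first collision after " + n + " objects");
    }
}
```

Because every object stays in the map, a hit here means two objects that are alive at the same time share a hash, not merely that a value was recycled after garbage collection.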

The header of an object in HotSpot consists of a class pointer and a "mark" word.

The source code of the data structure for the mark word can be found in the markOop.hpp file. That file contains a comment describing the memory layout of the mark word:

hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)

Here we can see that the identity hash code for normal Java objects on a 32-bit system is saved in the mark word and is 25 bits long.
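To make the layout concrete, here is a small sketch that packs and unpacks a 32-bit int using the field widths from the comment quoted above. It is an illustration of the bit arithmetic only: it does not read a real object header, the class and method names are invented, and it assumes the fields are packed right to left with the lock bits lowest.

```java
class MarkWordLayout {
    // Field widths copied from the quoted comment (32-bit, normal object).
    static final int LOCK_BITS = 2;
    static final int BIASED_LOCK_BITS = 1;
    static final int AGE_BITS = 4;
    static final int HASH_BITS = 25;

    // Assumption: fields are packed right to left, lock in the lowest bits.
    static final int AGE_SHIFT = LOCK_BITS + BIASED_LOCK_BITS; // 3
    static final int HASH_SHIFT = AGE_SHIFT + AGE_BITS;        // 7
    static final int HASH_MASK = (1 << HASH_BITS) - 1;
    static final int AGE_MASK = (1 << AGE_BITS) - 1;

    /** Builds an illustrative mark word from its four fields. */
    static int pack(int hash, int age, int biasedLock, int lock) {
        return ((hash & HASH_MASK) << HASH_SHIFT)
             | ((age & AGE_MASK) << AGE_SHIFT)
             | ((biasedLock & 1) << LOCK_BITS)
             | (lock & 0b11);
    }

    // Unsigned shifts, because the top hash bit lands in the int's sign bit.
    static int hashOf(int markWord) { return (markWord >>> HASH_SHIFT) & HASH_MASK; }
    static int ageOf(int markWord)  { return (markWord >>> AGE_SHIFT) & AGE_MASK; }
    static int lockOf(int markWord) { return markWord & 0b11; }

    public static void main(String[] args) {
        int word = pack(0x1ABCDEF, 3, 0, 1);
        System.out.printf("hash=0x%X age=%d lock=%d%n",
                hashOf(word), ageOf(word), lockOf(word));
        System.out.println("distinct hash values: " + (1 << HASH_BITS));
    }
}
```

With 25 bits there are only 2^25 = 33,554,432 possible hash values, which is one more reason identity hash codes cannot be unique.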

The general guidelines for implementing a hash function are:

The same object should return a consistent hashCode: it should not change with time or depend on any variable information (e.g. an algorithm seeded by a random number, or the values of mutable member fields).
The hash function should have a good random distribution. By that I mean that if you consider hashcodes as buckets, two objects should map to different buckets (hashcodes) as far as possible. Two objects having the same hashcode should be rare, although it can happen.
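As a sketch of the two guidelines, the hypothetical GridPoint class below derives its hashCode only from final fields, so the value cannot drift over the object's lifetime, and it combines both fields so that different points usually land in different buckets. Objects.hash is just one convenient way to mix fields; it is not the only reasonable choice.

```java
import java.util.Objects;

// Hypothetical value class, used only to illustrate the guidelines.
final class GridPoint {
    private final int x; // final: the hash inputs never change
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        // Depends only on immutable state; no random seed, no mutable fields.
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        GridPoint a = new GridPoint(1, 2);
        GridPoint b = new GridPoint(1, 2);
        GridPoint c = new GridPoint(2, 1);
        System.out.println("equal points, equal hashes: " + (a.hashCode() == b.hashCode()));
        System.out.println("swapped fields, different hashes: " + (a.hashCode() != c.hashCode()));
    }
}
```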