General advice and guidelines on how to properly override object.GetHashCode()
According to MSDN, a hash function must have the following properties:
1. If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two objects do not have to return different values.
2. The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again.
3. For the best performance, a hash function must generate a random distribution for all input.
I keep finding myself in the following scenario: I have created a class, implemented IEquatable<T>, and overridden object.Equals(object). MSDN states that:
"Types that override Equals must also override GetHashCode; otherwise, Hashtable might not work correctly."
And that is usually where I get stuck: how do you properly override object.GetHashCode()? I never really know where to start, and there seem to be a lot of pitfalls.
Here at StackOverflow there are quite a few questions related to overriding GetHashCode, but most of them seem to deal with quite particular cases and specific issues. I would therefore like to get a good compilation here: an overview with general advice and guidelines. What to do, what not to do, common pitfalls, where to start, etc.
I would like it to be directed especially at C#, but I would think it works much the same way for other .NET languages as well(?).
I think maybe the best way is to create one answer per topic, with a quick and short answer first (close to a one-liner if at all possible), then maybe some more information, and ending with related questions, discussions, blog posts, etc., if there are any. I can then create one post as the accepted answer (to get it on top) with just a "table of contents". Try to keep it short and concise. And don't just link to other questions and blog posts. Try to take the essence of them and then link to the source (especially since the source could disappear). Also, please try to edit and improve answers instead of creating lots of very similar ones.
I am not a very good technical writer, but I will at least try to format the answers so they look alike, create the table of contents, etc. I will also try to search for some of the related questions here at SO that answer parts of this, and maybe pull out the essence of the ones I can manage. But since I am not on very firm ground with this topic, I will try to stay away for the most part :p
When do I override object.GetHashCode?
Why do I have to override object.GetHashCode()?
What are those magic numbers seen in GetHashCode implementations?
Things that I would like to be covered, but haven't been yet:
Should base.GetHashCode() be merged into your hash code?
They are prime numbers. Prime numbers are used for creating hash codes because prime numbers maximize the usage of the hash code space.
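Specifically, start with the small prime number 3, and consider only the low-order nybbles of its successive multiples:
    3 * 1  =  3  ->  0011
    3 * 2  =  6  ->  0110
    3 * 3  =  9  ->  1001
    3 * 4  = 12  ->  1100
    3 * 5  = 15  ->  1111
    3 * 6  = 18  ->  0010
    3 * 7  = 21  ->  0101
    3 * 8  = 24  ->  1000
    3 * 9  = 27  ->  1011
    3 * 10 = 30  ->  1110
    3 * 11 = 33  ->  0001
    3 * 12 = 36  ->  0100
    3 * 13 = 39  ->  0111
    3 * 14 = 42  ->  1010
    3 * 15 = 45  ->  1101
    3 * 16 = 48  ->  0000
    3 * 17 = 51  ->  0011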
And we start over. But you'll notice that successive multiples of our prime generated every possible bit pattern in our nybble before starting to repeat. We can get the same effect with any odd prime and any number of bits (strictly speaking, any odd multiplier cycles through every pattern, since it shares no factor with a power of two), which is why primes are a popular choice for generating near-random hash codes. The reason we usually see larger primes instead of small primes like 3 in the example above is that, for greater numbers of bits in our hash code, the results obtained from using a small prime are not even pseudo-random: they're simply an increasing sequence until an overflow is encountered. For optimal randomness, a prime number that results in overflow for fairly small coefficients should be used, unless you can guarantee that your coefficients will not be small.
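To check the nybble argument yourself, here is a minimal sketch. The names NybbleDemo and LowNybbles are made up for this illustration. It collects the distinct low-order nybbles produced by the first 16 multiples of a given multiplier.

```csharp
using System;
using System.Collections.Generic;

public static class NybbleDemo
{
    // Returns the distinct low-order nybbles (value & 0xF) of the first
    // 16 multiples of `multiplier`, in the order they first appear.
    public static List<int> LowNybbles(int multiplier)
    {
        var seen = new List<int>();
        for (int k = 1; k <= 16; k++)
        {
            int nybble = (multiplier * k) & 0xF;
            if (!seen.Contains(nybble))
            {
                seen.Add(nybble);
            }
        }
        return seen;
    }
}
```

Calling NybbleDemo.LowNybbles(3) yields all 16 possible nybbles, while an even multiplier such as 6 only ever reaches 8 of them, which is the kind of wasted hash space the answer is warning about.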
Related links:
See Eric Lippert's Guidelines and rules for GetHashCode
You should override it whenever you have a meaningful measure of equality for objects of that type (i.e. you override Equals). If you knew the object was never going to be hashed you could leave it alone, but it's unlikely you can know this in advance.
The hash should be based only on the properties of the object that are used to define equality, since two objects that are considered equal should have the same hash code. In general you would usually do something like:
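public override int GetHashCode()
{
    const int mc = 31; // magic constant, usually some prime (31 is only an example value)
    unchecked // let the product wrap around on overflow instead of throwing
    {
        return mc * prop1.GetHashCode() * prop2.GetHashCode() * ... * propN.GetHashCode();
    }
}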
I usually assume multiplying the values together will produce a fairly uniform distribution, assuming each property's hash code function does the same, although this may well be wrong. Using this method, if the object's equality-defining properties change, then the hash code is also likely to change, which is acceptable given property #2 in your question. It also deals with all types in a uniform way.
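One caveat with a pure product is that a single property whose hash code is 0 turns the whole result into 0. A commonly seen alternative is to multiply a running value by a prime and add each field's hash. The following is only a sketch under assumptions: the Person class, its Name and Age properties, and the constants 17 and 31 are illustrative choices, not something prescribed by the framework.

```csharp
using System;

public sealed class Person : IEquatable<Person>
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age)
    {
        Name = name ?? throw new ArgumentNullException(nameof(name));
        Age = age;
    }

    // Equality is defined by Name and Age only.
    public bool Equals(Person other) =>
        other != null && Name == other.Name && Age == other.Age;

    public override bool Equals(object obj) => Equals(obj as Person);

    // Uses exactly the fields that Equals compares.
    public override int GetHashCode()
    {
        unchecked // overflow is expected and harmless here
        {
            int hash = 17;                          // arbitrary non-zero seed
            hash = hash * 31 + Name.GetHashCode();  // 31 is an arbitrary odd prime
            hash = hash * 31 + Age.GetHashCode();
            return hash;
        }
    }
}
```

On newer .NET versions, System.HashCode.Combine(Name, Age) is another option that does the mixing for you.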
You could return the same value for all instances, although this will make any algorithm that uses hashing (such as a dictionary) very slow: essentially all instances will be hashed to the same bucket, and lookup will then become O(n) instead of the expected O(1). This of course negates any benefit of using such structures for lookup.
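To make the constant-hash case concrete, here is a small sketch. ConstantHashKey and ConstantHashDemo are hypothetical names used only for this example. The dictionary still gives correct answers, because Equals is consulted within the bucket, but every key collides.

```csharp
using System;
using System.Collections.Generic;

public sealed class ConstantHashKey : IEquatable<ConstantHashKey>
{
    public int Id { get; }

    public ConstantHashKey(int id) { Id = id; }

    public bool Equals(ConstantHashKey other) => other != null && Id == other.Id;

    public override bool Equals(object obj) => Equals(obj as ConstantHashKey);

    // Legal (equal objects do get equal hash codes), but every key collides.
    public override int GetHashCode() => 1;
}

public static class ConstantHashDemo
{
    // Fills a dictionary with `count` keys that all share one hash code.
    public static Dictionary<ConstantHashKey, string> Build(int count)
    {
        var map = new Dictionary<ConstantHashKey, string>();
        for (int i = 0; i < count; i++)
        {
            map[new ConstantHashKey(i)] = "value" + i;
        }
        return map;
    }
}
```

Lookups such as map[new ConstantHashKey(42)] still succeed, but each one has to walk through the colliding entries, so the cost grows with the number of keys.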
Why do I have to override object.GetHashCode()?
Overriding this method is important because the following property must always remain true:
If two objects compare as equal, the GetHashCode method for each object must return the same value.
The reason, as stated by JaredPar in a blog post on implementing equality, is that
"Many classes use the hash code to classify an object. In particular hash tables and dictionaries tend to place objects in buckets based on their hash code. When checking if an object is already in the hash table it will first look for it in a bucket. If two objects are equal but have different hash codes they may be put into different buckets and the dictionary would fail to lookup the object."
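A minimal sketch of what that means in practice, assuming a hypothetical CustomerId type (the type and the LookupDemo helper are invented for this example): because GetHashCode is derived from the same field that Equals compares, a second instance with the same value ends up in the same bucket and is found.

```csharp
using System;
using System.Collections.Generic;

public sealed class CustomerId : IEquatable<CustomerId>
{
    public string Value { get; }

    public CustomerId(string value)
    {
        Value = value ?? throw new ArgumentNullException(nameof(value));
    }

    public bool Equals(CustomerId other) => other != null && Value == other.Value;

    public override bool Equals(object obj) => Equals(obj as CustomerId);

    // Derived from the same field as Equals, so equal ids share a bucket.
    public override int GetHashCode() => Value.GetHashCode();
}

public static class LookupDemo
{
    // Looks up a set entry using a different instance with an equal value.
    public static bool FindsEqualKey()
    {
        var set = new HashSet<CustomerId> { new CustomerId("C-1") };
        return set.Contains(new CustomerId("C-1"));
    }
}
```

If the GetHashCode override were removed, the two instances would fall back to the default object hash codes, which almost always differ, and the Contains call would most likely miss even though Equals says the objects are equal.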
A) You must override both Equals and GetHashCode if you want to employ value equality instead of the default reference equality. With the latter, two object references compare as equal if they both refer to the same object instance. With the former, they compare as equal if their values are the same, even if they refer to different objects. For example, you probably want to employ value equality for Date, Money, and Point objects.
B) In order to implement value equality you must override Equals and GetHashCode. Both should depend on the fields of the object that encapsulate the value. For example, Date.Year, Date.Month and Date.Day; or Money.Currency and Money.Amount; or Point.X, Point.Y and Point.Z. You should also consider overloading operator ==, operator !=, operator <, and operator >.
C) The hash code doesn't have to stay constant throughout the object's lifetime. However, it must not change while the object participates as a key in a hash table. From the MSDN documentation for Dictionary: "As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value." If you must change the value of a key, remove the entry from the dictionary, change the key value, and add the entry back.
D) IMO, you will simplify your life if your value objects are themselves immutable.
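Points B to D can be sketched together as follows. This is only an illustration under assumptions: Point3 and RekeyDemo are hypothetical names, and the constants 17 and 31 are arbitrary. The struct is immutable, Equals and GetHashCode use the same three fields, == and != are overloaded to match, and the helper shows the remove-then-add pattern from point C.

```csharp
using System;
using System.Collections.Generic;

public readonly struct Point3 : IEquatable<Point3>
{
    public int X { get; }
    public int Y { get; }
    public int Z { get; }

    public Point3(int x, int y, int z) { X = x; Y = y; Z = z; }

    public bool Equals(Point3 other) => X == other.X && Y == other.Y && Z == other.Z;

    public override bool Equals(object obj) => obj is Point3 other && Equals(other);

    // Same fields as Equals; unchecked so overflow simply wraps around.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + X;
            hash = hash * 31 + Y;
            hash = hash * 31 + Z;
            return hash;
        }
    }

    public static bool operator ==(Point3 left, Point3 right) => left.Equals(right);
    public static bool operator !=(Point3 left, Point3 right) => !left.Equals(right);
}

public static class RekeyDemo
{
    // "Changing" a key: remove the entry, then add it back under the new key.
    // Does nothing if oldKey is absent; Add throws if newKey already exists.
    public static void Rekey(Dictionary<Point3, string> map, Point3 oldKey, Point3 newKey)
    {
        if (map.TryGetValue(oldKey, out string value))
        {
            map.Remove(oldKey);
            map.Add(newKey, value);
        }
    }
}
```

Because Point3 cannot be mutated, a key can never change underneath the dictionary, which is the simplification point D is getting at.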
Visual Studio 2017
https://docs.microsoft.com/en-us/visualstudio/ide/reference/generate-equals-gethashcode-methods?view=vs-2017
ReSharper
https://www.jetbrains.com/help/resharper/Code_Generation__Equality_Members.html
Why do I have to override object.GetHashCode()?
"Types that override Equals must also override GetHashCode; otherwise, Hashtable might not work correctly."
Related links:
It doesn't need to be based only on immutable fields. I would base it on the fields that determine the outcome of the Equals method.
You seem to misunderstand property #2. The hash code doesn't need to stay the same throughout the object's lifetime. It just needs to stay the same as long as the values that determine the outcome of the Equals method are not changed. So logically, you base the hash code on those values only. Then there shouldn't be a problem.
public override int GetHashCode()
{
    return IntProp1 ^ IntProp2 ^ StrProp3.GetHashCode() ^ StrProp4.GetHashCode() ^ CustomClassProp.GetHashCode();
}
Do the same in the custom class's GetHashCode method. Works like a charm.
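One thing to keep in mind with plain XOR, as in the snippet above, is that it ignores the order of its inputs and that two equal inputs cancel out to 0. Whether that matters depends on your data. The sketch below compares it with a multiply-and-add combiner; XorHashDemo and both method names are made up for this example, and 17 and 31 are arbitrary constants.

```csharp
using System;

public static class XorHashDemo
{
    // Plain XOR: order-insensitive, and equal inputs cancel each other out.
    public static int XorCombine(int a, int b) => a ^ b;

    // Multiply-and-add: the position of each value affects the result.
    public static int OrderedCombine(int a, int b)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + a;
            hash = hash * 31 + b;
            return hash;
        }
    }
}
```

With XorCombine, the pairs (1, 2) and (2, 1) collide and (5, 5) hashes to 0, whereas OrderedCombine keeps (1, 2) and (2, 1) apart. For a type like a 2D point, where swapped coordinates are common, that difference can noticeably affect how evenly keys spread across buckets.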