
Overriding GetHashCode()

In this article, Jon Skeet mentioned that he usually uses this kind of algorithm for overriding GetHashCode():

public override int GetHashCode()
{
  unchecked // Overflow is fine, just wrap
  {
    int hash = 17;
    // Suitable nullity checks etc, of course :)
    hash = hash * 23 + Id.GetHashCode();
    return hash;
  }
}

Now, I've tried using this, but ReSharper tells me that GetHashCode() should hash using only read-only fields (it compiles fine, though). What would be good practice here, given that right now I can't really make my fields read-only?

I tried generating this method with ReSharper; here's the result.

public override int GetHashCode()
{
  return base.GetHashCode();
}

This doesn't contribute much, to be honest...

If all your fields are mutable and you have to implement GetHashCode, I am afraid this is the implementation you would need:

public override int GetHashCode() 
{ 
    return 1; 
} 

Yes, this is inefficient, but it is at least correct.

The problem is that the Dictionary and HashSet collections use GetHashCode to place each item in a bucket. If the hash code is calculated from mutable fields and those fields change after the object has been placed into the HashSet or Dictionary, the object can no longer be found in that HashSet or Dictionary.
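To make the failure concrete, here is a minimal sketch. MutableKey and MutationDemo are hypothetical names invented for this illustration; the hash algorithm is the one quoted in the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration: Equals and GetHashCode both depend on the mutable Id.
public class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        return obj is MutableKey other && other.Id == Id;
    }

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            hash = hash * 23 + Id.GetHashCode();
            return hash;
        }
    }
}

public static class MutationDemo
{
    // Returns what Contains reports before and after the key is mutated.
    public static (bool Before, bool After, int Count) Run()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        bool before = set.Contains(key); // hash matches the one recorded at insertion

        key.Id = 2; // changes the hash code while the object is still in the set

        bool after = set.Contains(key);  // the set still holds the old hash
        return (before, after, set.Count);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.Before); // True
        Console.WriteLine(result.After);  // False
        Console.WriteLine(result.Count);  // 1: still stored, but unreachable by lookup
    }
}
```

The object is not removed by the mutation; it is simply filed under a hash code that no longer matches, so lookups with the new state miss it.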

Note that when every object returns the same hash code, 1, all the objects end up in the same bucket in the HashSet or Dictionary. So effectively only one bucket is ever used. When looking up an object, the collection does an equality check on each of the objects in that bucket. This is like doing a search in a linked list.
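One way to observe this linear behaviour is to count equality checks. The sketch below assumes a hypothetical ConstantHashItem type with a static counter added purely for the demonstration; it does not measure time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: constant hash code, equality based on a mutable field.
public class ConstantHashItem
{
    public static int EqualsCalls; // counts equality checks, for the demo only

    public int Value { get; set; }

    public override bool Equals(object obj)
    {
        EqualsCalls++;
        return obj is ConstantHashItem other && other.Value == Value;
    }

    public override int GetHashCode()
    {
        return 1;
    }
}

public static class ConstantHashDemo
{
    // Fills a set with `count` items, then looks up a value that is absent.
    // Returns how many Equals calls that single lookup needed.
    public static int EqualsCallsForMiss(int count)
    {
        var set = new HashSet<ConstantHashItem>();
        for (int i = 0; i < count; i++)
        {
            set.Add(new ConstantHashItem { Value = i });
        }

        ConstantHashItem.EqualsCalls = 0; // ignore the checks done while adding
        set.Contains(new ConstantHashItem { Value = -1 });
        return ConstantHashItem.EqualsCalls;
    }

    public static void Main()
    {
        // The number of checks is expected to grow in step with the size of the set.
        Console.WriteLine(EqualsCallsForMiss(10));
        Console.WriteLine(EqualsCallsForMiss(1000));
    }
}
```

Adding items is affected in the same way, since each Add has to rule out duplicates among everything already stored under that hash code.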

Somebody may argue that implementing the hash code based on mutable fields can be fine if we can make sure the fields are never changed after the objects are added to a HashSet or Dictionary collection. My personal view is that this is error-prone. Somebody taking over your code two years later might not be aware of this and break the code accidentally.

Please note that your GetHashCode must go hand in hand with your Equals method. If you can just use reference equality (when you'd never have two different instances of your class that can be equal), then you can safely use the Equals and GetHashCode that are inherited from Object. This would work much better than simply returning 1 from GetHashCode.
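As a small illustration of the reference-equality option, the sketch below uses a hypothetical Customer class with no overrides at all, so the versions inherited from Object apply.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: no Equals/GetHashCode overrides, so Object's reference-based versions are used.
public class Customer
{
    public int Id { get; set; }
}

public static class ReferenceEqualityDemo
{
    // Returns whether the mutated instance and a look-alike instance are found.
    public static (bool SameInstance, bool LookAlike) Run()
    {
        var customer = new Customer { Id = 1 };
        var set = new HashSet<Customer> { customer };

        customer.Id = 2; // mutation does not affect the inherited hash code

        bool sameInstance = set.Contains(customer);
        bool lookAlike = set.Contains(new Customer { Id = 2 }); // a different instance
        return (sameInstance, lookAlike);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.SameInstance); // True
        Console.WriteLine(result.LookAlike);    // False
    }
}
```

The trade-off is visible in the second lookup: two instances with identical field values are never considered equal, which is only acceptable if that matches what equality should mean for the class.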

I personally tend to return a different numeric value from each implementation of GetHashCode() in a class which has no immutable fields. This means that if I have a dictionary containing different implementing types, there is a chance that instances of different types will be put in different buckets.

For example:

public class A
{
    // TODO Equals override

    public override int GetHashCode()
    {
        return 21313;
    } 
}

public class B
{
    // TODO Equals override

    public override int GetHashCode()
    {
        return 35507;
    } 
}

Then if I have a Dictionary<object, TValue> containing instances of A, B and other types, the performance of the lookup will be better than if all the implementations of GetHashCode returned the same numeric value.

It should also be noted that I use prime numbers to get a better distribution.

As per the comments, I have provided a LINQPad sample here which demonstrates the performance difference between using return 1 for different types and returning a different value for each type.
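The LINQPad sample itself is not reproduced on this page. As a rough stand-in, and not the author's sample, the sketch below counts equality checks instead of timing them. The Counter class, the Value properties, and the Equals bodies are assumptions added for the demonstration (the original leaves Equals as a TODO); only the two constants come from the example above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shared counter, for the demo only.
public static class Counter
{
    public static int EqualsCalls;
}

public class A
{
    public int Value { get; set; } // assumed mutable state

    public override bool Equals(object obj)
    {
        Counter.EqualsCalls++;
        return obj is A other && other.Value == Value;
    }

    public override int GetHashCode()
    {
        return 21313;
    }
}

public class B
{
    public int Value { get; set; } // assumed mutable state

    public override bool Equals(object obj)
    {
        Counter.EqualsCalls++;
        return obj is B other && other.Value == Value;
    }

    public override int GetHashCode()
    {
        return 35507;
    }
}

public static class PerTypeHashDemo
{
    // Stores `perType` instances of A and of B, then looks up an absent A.
    // Returns how many Equals calls that single lookup needed.
    public static int EqualsCallsForMissingA(int perType)
    {
        var dict = new Dictionary<object, string>();
        for (int i = 0; i < perType; i++)
        {
            dict.Add(new A { Value = i }, "a" + i);
            dict.Add(new B { Value = i }, "b" + i);
        }

        Counter.EqualsCalls = 0; // ignore the checks done while adding
        dict.ContainsKey(new A { Value = -1 });
        return Counter.EqualsCalls;
    }

    public static void Main()
    {
        // With 500 of each type stored, the lookup should only need to compare
        // against the A instances, not against all 1000 keys.
        Console.WriteLine(EqualsCallsForMissingA(500));
    }
}
```

If both classes returned the same constant, the same lookup would have to run Equals against every stored key of either type.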
