Same GetHashCode() for different objects

After executing this piece of code:

int a = 50;
float b = 50.0f;
Console.WriteLine(a.GetHashCode() == b.GetHashCode());

We get False, which is what I expected: we are dealing with different objects, so we should get different hashes.

However, if we execute this:

int a = 0;
float b = 0.0f;
Console.WriteLine(a.GetHashCode() == b.GetHashCode());

We get True. Both objects return the same hash code: 0.

Why does this happen? Aren't they supposed to return different hashes?

The GetHashCode of System.Int32 works like this:

public override int GetHashCode()
{
    return this;
}

So when the value is 0, it returns 0.

The GetHashCode of System.Single (float is its C# alias) is:

public unsafe override int GetHashCode()
{
    float num = this;
    if (num == 0f)
    {
        return 0;
    }
    return *(int*)(&num);
}

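As you can see, for 0f it returns 0. For any other value it returns the float's raw 32-bit pattern reinterpreted as an int, which is why 50.0f does not hash to 50.

The decompiled code above was obtained with ILSpy.

From the MSDN documentation: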
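To see the same behaviour without unsafe code, here is a minimal sketch that mirrors the decompiled logic. The class FloatHashDemo and the method BitsHash are hypothetical names introduced for illustration, and BitConverter.SingleToInt32Bits is assumed to be available (it is not present in the older .NET Framework). Newer runtimes may implement Single.GetHashCode differently, so this imitates the listing above rather than any particular framework version.

```csharp
using System;

public static class FloatHashDemo
{
    // Hypothetical helper: imitates the decompiled Single.GetHashCode
    // by returning the raw IEEE 754 bit pattern, except for zero.
    public static int BitsHash(float value)
    {
        if (value == 0f)
        {
            return 0; // +0.0f and -0.0f compare equal, so both map to 0
        }
        return BitConverter.SingleToInt32Bits(value);
    }

    public static void Main()
    {
        Console.WriteLine(BitsHash(50.0f));        // 1112014848 (0x42480000)
        Console.WriteLine(BitsHash(0.0f));         // 0
        Console.WriteLine(BitsHash(-0.0f));        // 0
        Console.WriteLine(50 == BitsHash(50.0f));  // False
    }
}
```

The special case for zero matters because -0.0f and +0.0f are equal but have different bit patterns; without it, two equal values would produce different hash codes.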

Two objects that are equal return hash codes that are equal. However, the reverse is not true: equal hash codes do not imply object equality, because different (unequal) objects can have identical hash codes.

Objects that are conceptually equal are obligated to return the same hash. Objects that are different are not obligated to return different hashes. That would only be possible if no more than 2^32 distinct objects could ever exist, and there are far more than that. When different objects produce the same hash, it is called a "collision". A good hash algorithm minimizes collisions as much as possible, but they can never be eliminated entirely.
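Collisions cost performance, not correctness, because hash-based collections fall back to Equals when hash codes match. The sketch below uses a deliberately bad, hypothetical key type (CollidingKey) whose GetHashCode always returns 0; a HashSet still tells unequal keys apart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: every instance collides on purpose.
public sealed class CollidingKey : IEquatable<CollidingKey>
{
    public string Name { get; }

    public CollidingKey(string name) { Name = name; }

    public bool Equals(CollidingKey other) => other != null && Name == other.Name;

    public override bool Equals(object obj) => Equals(obj as CollidingKey);

    // Legal but terrible: equal objects still get equal hashes,
    // so the contract holds even though everything collides.
    public override int GetHashCode() => 0;
}

public static class CollisionDemo
{
    public static int CountDistinct()
    {
        var set = new HashSet<CollidingKey>
        {
            new CollidingKey("a"),
            new CollidingKey("b"),
            new CollidingKey("a") // equal to the first entry, so not added
        };
        return set.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // 2
    }
}
```

With a constant hash every lookup degrades to a linear scan, which is why real implementations try to spread values out.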

Why should they? Hash codes form a finite set: only as many values as fit in an Int32, which is 2^32. A double has 64 bits, so there are a great many doubles that share a hash code with any given int or with any other given double.

Hash codes basically have to follow two simple rules:
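As an illustration of why a 64-bit value cannot get a unique 32-bit hash, the sketch below folds the two halves of a double's bit pattern together with XOR. Fold is a hypothetical function written for this example and is not claimed to be what any particular runtime does; any function from 64 bits to 32 bits must collide somewhere.

```csharp
using System;

public static class FoldDemo
{
    // Hypothetical 64-bit to 32-bit fold: XOR of the low and high halves.
    public static int Fold(long bits) => unchecked((int)bits ^ (int)(bits >> 32));

    public static int FoldDouble(double value) => Fold(BitConverter.DoubleToInt64Bits(value));

    public static void Main()
    {
        double x = 1.0;                                       // bits 0x3FF0000000000000
        double y = BitConverter.Int64BitsToDouble(0x3FF00000L); // bits 0x000000003FF00000

        Console.WriteLine(x == y);                            // False
        Console.WriteLine(FoldDouble(x) == FoldDouble(y));    // True
    }
}
```

Here x is 1.0 and y is a tiny subnormal number, yet both fold to the same 32-bit value.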

  1. If two objects are equal, they must have the same hash code.
  2. If an object does not mutate its internal state, its hash code should remain the same.

Nothing obliges two objects that are not equal to have different hash codes; for any type with more than 2^32 possible values, it is mathematically impossible.
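The two rules can be sketched with a small immutable value type. Point and the multiplier 397 are illustrative choices made for this example, not something taken from the framework. Equal points get equal hashes, the hash never changes because the fields are read-only, and unequal points are still free to collide.

```csharp
using System;

// Hypothetical immutable type used only to illustrate the two rules.
public readonly struct Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    public bool Equals(Point other) => X == other.X && Y == other.Y;

    public override bool Equals(object obj) => obj is Point p && Equals(p);

    // Rule 1: computed only from the fields that Equals compares.
    // Rule 2: the fields are read-only, so the result is stable.
    public override int GetHashCode() => unchecked((X * 397) ^ Y);
}

public static class PointDemo
{
    public static void Main()
    {
        var a = new Point(1, 0);
        var b = new Point(1, 0);
        var c = new Point(0, 397);

        Console.WriteLine(a.Equals(b) && a.GetHashCode() == b.GetHashCode()); // True
        Console.WriteLine(a.Equals(c));                                       // False
        Console.WriteLine(a.GetHashCode() == c.GetHashCode());                // True (a collision)
    }
}
```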
