
How does bit shifting values in this GetHashCode method improve hashing?

I found two data types with hash code methods in a code base I'm working on, and I don't fully understand why these implementations were chosen:

public override int GetHashCode()
{
    return x.GetHashCode() ^ y.GetHashCode() << 2;
}

public override int GetHashCode()
{
    return x.GetHashCode() ^ y.GetHashCode() << 2 ^ z.GetHashCode() >> 2;
}

How does the bit shifting operation make these hash values any better?

Let's say you have a Point data structure represented by an x and a y variable. Without the bit shifting, the hash code for (1,0) would be 1, and the hash code for (0,1) would also be 1. Now do the same thing with the bit shifting: for (1,0) we get a hash code of 1, but for (0,1) we now get a hash code of 4.
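To make the (1,0) versus (0,1) comparison concrete, here is a minimal sketch. The class and method names are hypothetical; only the two formulas come from the code in the question. It relies on the fact that Int32.GetHashCode() returns the integer itself.

```csharp
using System;

// Hypothetical helper names, for illustration only.
public static class PointHashDemo
{
    // Plain XOR is symmetric in x and y, so swapped coordinates collide.
    public static int XorOnly(int x, int y) => x.GetHashCode() ^ y.GetHashCode();

    // XOR with y shifted left by 2 bits, as in the question's two-field method.
    // << binds tighter than ^, so this evaluates as x ^ (y << 2).
    public static int XorShift(int x, int y) => x.GetHashCode() ^ y.GetHashCode() << 2;
}
```

PointHashDemo.XorOnly(1, 0) and PointHashDemo.XorOnly(0, 1) both return 1, while PointHashDemo.XorShift(1, 0) returns 1 and PointHashDemo.XorShift(0, 1) returns 4.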

What the bit shifting provides is that the same inputs in a different order produce different hash codes. That way (1,0) and (0,1) don't end up in the same hash bucket and degrade your HashSet/Dictionary performance.
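One rough way to see the effect on bucket distribution is to count how many distinct hash codes a small grid of points produces. The sketch below is an illustration under assumptions: the helper names and the grid-counting approach are mine, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for comparing hash functions on a grid of int pairs.
public static class CollisionCountDemo
{
    // Counts the distinct hash codes produced for every (x, y)
    // with 0 <= x < size and 0 <= y < size.
    public static int CountDistinct(int size, Func<int, int, int> hash)
    {
        var seen = new HashSet<int>();
        for (int x = 0; x < size; x++)
        {
            for (int y = 0; y < size; y++)
            {
                seen.Add(hash(x, y));
            }
        }
        return seen.Count;
    }

    // Symmetric combination without any shift.
    public static int PlainXor(int x, int y) => x ^ y;

    // The question's combination: y is shifted left by 2 bits before the XOR.
    public static int ShiftedXor(int x, int y) => x ^ (y << 2);
}
```

For the 256 points of a 16 by 16 grid, CollisionCountDemo.CountDistinct(16, CollisionCountDemo.PlainXor) gives 16 distinct codes and CollisionCountDemo.CountDistinct(16, CollisionCountDemo.ShiftedXor) gives 64. The 2-bit shift reduces collisions but still leaves many points sharing a code.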

Normally you would use a much larger offset than a left shift by two bits. Bit shifting also discards data when the hash codes are near Int32.MaxValue, because the high bits are shifted out. Here is the pattern I normally use:

public override int GetHashCode()
{
    unchecked
    {
        var hashCode = X;
        hashCode = (hashCode*397) ^ Y;
        hashCode = (hashCode*397) ^ Z;
        return hashCode;
    }
}

(This is the default implementation that comes with ReSharper's "Insert Comparison Method" feature. To add more fields you just keep doing hashCode = (hashCode*397) ^ XXXXXXX.)
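As a sketch of extending the pattern to a fourth field, assume a hypothetical value type Vector4i with int fields X, Y, Z and W (the type and its members are made up for illustration). For fields that are not int, you would presumably combine field.GetHashCode() instead of the field itself.

```csharp
using System;

// Hypothetical four-field value type, for illustration only.
public readonly struct Vector4i : IEquatable<Vector4i>
{
    public int X { get; }
    public int Y { get; }
    public int Z { get; }
    public int W { get; }

    public Vector4i(int x, int y, int z, int w)
    {
        X = x;
        Y = y;
        Z = z;
        W = w;
    }

    // Equality must agree with GetHashCode: equal values produce equal codes.
    public bool Equals(Vector4i other) =>
        X == other.X && Y == other.Y && Z == other.Z && W == other.W;

    public override bool Equals(object obj) => obj is Vector4i other && Equals(other);

    public override int GetHashCode()
    {
        unchecked
        {
            var hashCode = X;
            hashCode = (hashCode * 397) ^ Y;
            hashCode = (hashCode * 397) ^ Z;
            hashCode = (hashCode * 397) ^ W; // one more line per extra field
            return hashCode;
        }
    }
}
```

With this sketch, new Vector4i(1, 0, 0, 0) and new Vector4i(0, 1, 0, 0) get different hash codes, while two instances with the same field values get the same one.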

By using * inside unchecked instead of <<, any intermediate value larger than Int32.MaxValue simply wraps around without raising an error.
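The difference between shifting and unchecked multiplication can be sketched as follows. The class and method names are hypothetical. A left shift drops the bits pushed past bit 31, so two inputs that differ only in those bits collide, whereas unchecked multiplication wraps modulo 2^32 and, for an odd multiplier such as 397, maps distinct inputs to distinct outputs.

```csharp
using System;

// Hypothetical helper names, for illustration only.
public static class OverflowDemo
{
    // Left shift discards the two highest bits of the input.
    public static int ShiftBy2(int value) => value << 2;

    // In an unchecked context the product wraps around instead of throwing.
    public static int MultiplyUnchecked(int value) => unchecked(value * 397);

    // In a checked context the same product throws OverflowException on overflow.
    public static int MultiplyChecked(int value) => checked(value * 397);
}
```

Here OverflowDemo.ShiftBy2(1) and OverflowDemo.ShiftBy2(0x40000001) both return 4, because the differing high bit is shifted out. OverflowDemo.MultiplyUnchecked(int.MaxValue) wraps to 2147483251, while OverflowDemo.MultiplyChecked(int.MaxValue) throws an OverflowException.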
