How does bit shifting values in this GetHashCode method improve hashing?
I found two data types with hash code methods in a code base I'm working on, and I don't fully understand why these implementations were chosen:
public override int GetHashCode()
{
return x.GetHashCode() ^ y.GetHashCode() << 2;
}
public override int GetHashCode()
{
return x.GetHashCode() ^ y.GetHashCode() << 2 ^ z.GetHashCode() >> 2;
}
How does the bit shifting operation make these hash values any better?
Let's say you have a Point data structure represented by an x and a y variable. Without the bit shifting, the hash code for (1,0) would be 1, and the hash code for (0,1) would also be 1. Now do the same thing with the bit shifting: for (1,0) we get a hash code of 1, but for (0,1) we now get a hash code of 4.
What the bit shifting provides is that the same inputs in a different order produce different hash codes. That way (1,0) and (0,1) don't end up falling into the same hash bucket and degrading your HashSet/Dictionary performance.
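The (1,0) versus (0,1) arithmetic can be checked with a small sketch. The class and method names here are hypothetical and only illustrate the two combining styles; it relies on Int32.GetHashCode returning the integer's own value.

```csharp
using System;

public static class ShiftHashDemo
{
    // XOR only: swapping x and y always produces the same hash code.
    public static int XorOnly(int x, int y) => x.GetHashCode() ^ y.GetHashCode();

    // Pattern from the question. << binds tighter than ^,
    // so this is evaluated as x ^ (y << 2).
    public static int XorWithShift(int x, int y) => x.GetHashCode() ^ y.GetHashCode() << 2;

    public static void Main()
    {
        Console.WriteLine(XorOnly(1, 0));      // 1
        Console.WriteLine(XorOnly(0, 1));      // 1
        Console.WriteLine(XorWithShift(1, 0)); // 1
        Console.WriteLine(XorWithShift(0, 1)); // 4

        // A shift of only two bits still collides for nearby small values.
        Console.WriteLine(XorWithShift(4, 0) == XorWithShift(0, 1)); // True
    }
}
```

The last line shows why such a small shift only helps a little: (4,0) and (0,1) both hash to 4.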
Normally you would use a much larger offset than a left shift by two bits. Bit shifting also truncates data when dealing with hash codes near Int32.MaxValue, because the high-order bits that are shifted out are lost. Here is the pattern I normally use:
public override int GetHashCode()
{
unchecked
{
var hashCode = X;
hashCode = (hashCode*397) ^ Y;
hashCode = (hashCode*397) ^ Z;
return hashCode;
}
}
(This is the default implementation that comes with ReSharper's "Insert Comparison Method" feature. To add more fields, you just keep doing hashCode = (hashCode*397) ^ XXXXXXX.)
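By using * inside an unchecked block instead of <<, any result that is larger than Int32.MaxValue just wraps around without raising an overflow error. Because 397 is odd, distinct running hash values also stay distinct after the multiplication, whereas a shift discards the bits it pushes out.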
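As a rough illustration of the difference, the sketch below compares a left shift with the multiply-and-XOR step. The class and method names are hypothetical, and the inputs are picked only to make the lost high bit visible.

```csharp
using System;

public static class OverflowDemo
{
    // A left shift silently drops high-order bits, in checked and unchecked code alike.
    public static int ShiftLeft2(int value) => value << 2;

    // One step of the multiply-and-XOR pattern; unchecked makes overflow wrap around.
    public static int Combine(int hash, int field)
    {
        unchecked
        {
            return (hash * 397) ^ field;
        }
    }

    // The same arithmetic in a checked context throws OverflowException on overflow.
    public static int CombineChecked(int hash, int field)
    {
        checked
        {
            return (hash * 397) ^ field;
        }
    }

    public static void Main()
    {
        // 0x40000001 and 1 differ only in a high bit that << 2 shifts out.
        Console.WriteLine(ShiftLeft2(0x40000001) == ShiftLeft2(1)); // True

        // Unchecked multiplication wraps, and the two inputs stay distinct.
        Console.WriteLine(Combine(0x40000001, 0) == Combine(1, 0)); // False

        try
        {
            CombineChecked(int.MaxValue, 0);
        }
        catch (OverflowException)
        {
            Console.WriteLine("checked multiplication overflowed");
        }
    }
}
```

The unchecked block matters mainly when the project is compiled with overflow checking enabled. Without it, the multiplication could throw for large hash codes.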