简体   繁体   中英

How does bit shifting values in this GetHashCode method improve hashing?

I found two data types with hash code methods in a code base I'm working on, which I don't fully understand why they were chosen:

public override int GetHashCode()
{
    return x.GetHashCode() ^ y.GetHashCode() << 2;
}

public override int GetHashCode()
{
    return x.GetHashCode() ^ y.GetHashCode() << 2 ^ z.GetHashCode() >> 2;
}

How does the bit shifting operation make these hash values any better?

Let's say you have a Point data structure represented by a x and y variable. Without the bit shifting the value of hash code for (1,0) would be 1 , and the hash code for (0,1) would also be 1 . Now do the same thing with the bit shifting, for (1,0) we get a hash code of 1 , but for (0,1) we now get a hash code of 4

What the bit shifting provides is if you have the same inputs but in a different order you want to get different hash codes, that way (1,0) and (0,1) don't end up falling in to the same hash bucket and degrading your hashset/dictionary performance.

Normally you would do a much larger offset than just left shifting twice. Bitshifting also causes data to get truncated if dealing with hash codes near Int32.MaxValue . Here is the pattern I normally use

public override int GetHashCode()
{
    unchecked
    {
        var hashCode = X;
        hashCode = (hashCode*397) ^ Y;
        hashCode = (hashCode*397) ^ Z;
        return hashCode;
    }
}

(this is the default implementation that comes with Resharper's "Insert Comparison Method" feature. To add more fields you just keep doing hashCode = (hashCode*397) ^ XXXXXXX )

By using * with unchecked instead of << any value that is larger than Int32.MaxValue just overflows without error.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM