简体   繁体   中英

Class to calculate hash codes in GetHashCode

I use an XOR based implementation in the GetHashCode implementation of most of my equatable types.

I've read several posts explaining why it is not the best solution so I decided to implement GetHashCode as suggested by Jon Skeet :

unchecked // Overflow is fine, just wrap
{
    int hash = 17;

    hash = hash * 23 + field1.GetHashCode();
    hash = hash * 23 + field2.GetHashCode();
    hash = hash * 23 + field3.GetHashCode();

    return hash;
}

Since the code is likely to be similar in most implementations, I tried to build a helper class to calculate hash codes for all my classes. It should be an easy thing to do but one of the main constraints with GetHashCode is it has to be fast. Therefore any implementation involving allocation is probably a no go (for instance, the use of a non static class).

Ideally a call to such a method would look like:

public override GetHashCode() => HashCodeCalculator.Calculate(X, Y, Z);

And have all the logic (unchecked + primes + null check...). But the use of a params parameter implicitly creates an array.

Is it best to duplicate the hashing algorithm in each class instead? Or is a class like the following as efficient?

public static class HashCalculator
{
    private const int _seed = 5923;
    private const int _multiplier = 7481;

    public static int Add(object value) => Add(_seed, value);

    public static int Add(int current, object value)
    {
        int valueHashCode = (value != null) ? value.GetHashCode() : 0;

        unchecked
        {
            return (current * _multiplier) + valueHashCode;
        }
    }
}

which can then be used like this:

public override int GetHashCode()
{
  int result = HashCalculator.Add(Prop1);
  result = HashCalculator.Add(result, Prop2);

  return result;
}

您可以为各种小的固定数量的参数(2、3、4 等,直到您决定停止)创建重载,以避免数组分配,然后有一个params重载,只有在出现时才需要使用是特别大量的操作数,此时数组分配的开销不太可能成为问题(因为它将是完成工作的较小百分比)。

I can see why it is so tempting to have some kind of helper tool to calc hashes, but in this case efficiency is in great contradiction to convenience. You are trying to have a cookie and eat it and the answer depends on how much cookie you are willing to left over :)

  • One additional method call? Then it should have signature simmilar to int HashCode(params int subhashcodes) but invoking it will be ugly because you need to provide hashcodes of fields as parameters.
  • One method call and boxing? Then you can change int to object in previous signature to call fields hashcodes inside your method (I'm not fully sure that there will be no boxing in first case - feel free to correct me)

Personally I will stick to writing it by hand (or by Resharper).

After benchmarking it appears that using a struct like the following is almost as efficient as XORing and nicely encapsulates hash codes calculation.

/// <summary>
/// Calculates a hash code based on multiple hash codes.
/// </summary>
public struct HashCode
{
    private const int _seed = 5923;
    private const int _multiplier = 7481;

    /// <summary>
    /// Builds a new hash code.
    /// </summary>
    /// <returns>The built hash code.</returns>
    public static HashCode Build() => new HashCode(_seed);

    /// <summary>
    /// Constructor from a hash value.
    /// </summary>
    /// <param name="value">Hash value.</param>
    private HashCode(int value)
    {
        _value = value;
    }

    /// <summary>
    /// Builds a new hash code and initializes it from a hash code source.
    /// </summary>
    /// <param name="hashCodeSource">Item from which a hash code can be extracted (using GetHashCode).</param>
    public HashCode(object hashCodeSource)
    {
        int sourceHashCode = GetHashCode(hashCodeSource);
        _value = AddValue(_seed, sourceHashCode);
    }
    private readonly int _value;

    /// <summary>
    /// Returns the hash code for a given hash code source (0 if the source is null).
    /// </summary>
    /// <param name="hashCodeSource">Item from which a hash code can be extracted (using GetHashCode).</param>
    /// <returns>The hash code.</returns>
    private static int GetHashCode(object hashCodeSource) => (hashCodeSource != null) ? hashCodeSource.GetHashCode() : 0;

    /// <summary>
    /// Adds a new hash value to a hash code.
    /// </summary>
    /// <param name="currentValue">Current hash value.</param>
    /// <param name="valueToAdd">Value to add.</param>
    /// <returns>The new hash value.</returns>
    private static int AddValue(int currentValue, int valueToAdd)
    {
        unchecked
        {
            return (currentValue * _multiplier) + valueToAdd;
        }
    }

    /// <summary>
    /// Adds an object's hash code.
    /// </summary>
    /// <param name="hashCode">Hash code to which the object's hash code has to be added.</param>
    /// <param name="hashCodeSource">Item from which a hash code can be extracted (using GetHashCode).</param>
    /// <returns>The updated hash instance.</returns>
    public static HashCode operator +(HashCode hashCode, object hashCodeSource)
    {
        int sourceHashCode = GetHashCode(hashCodeSource);
        int newHashValue = AddValue(hashCode._value, sourceHashCode);

        return new HashCode(newHashValue);
    }

    /// <summary>
    /// Implicit cast operator to int.
    /// </summary>
    /// <param name="hashCode">Hash code to convert.</param>
    public static implicit operator int(HashCode hashCode) => hashCode._value;
}

which can be used like this:

public override int GetHashCode() => new HashCode(Prop1) + Prop2;

EDIT: .net core now has such a HashCode struct .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM