简体   繁体   中英

Generic IEqualityComparer<T> and GetHashCode

Being somewhat lazy about implementing lots of IEqualityComparers, and given that I couldn't easily edit class implementations of object being compared, I went with the following, meant to be used with Distinct() and Except() extension methods. :

public class GenericEqualityComparer<T> : IEqualityComparer<T>
{
    Func<T, T, bool> compareFunction;
    Func<T, int> hashFunction;

    public GenericEqualityComparer(Func<T, T, bool> compareFunction, Func<T, int> hashFunction)
    {
        this.compareFunction = compareFunction;
        this.hashFunction = hashFunction;
    }

    public bool Equals(T x, T y)
    {
        return compareFunction(x, y);
    }

    public int GetHashCode(T obj)
    {
        return hashFunction(obj);
    }
}

Seems nice, but is giving an hash function everytimes REALLY necessary ? I understand that the hashcode is used to put objects in buckets. Different buckets, object are not equal, and equal is not called.

If GetHashCode returns the same value, equals is called. ( from : Why is it important to override GetHashCode when Equals method is overridden? )

So what could go wrong, if for example (and I hear a lot of programmers screaming in horror), GetHashCode returns a constant, to force the call to Equal ?

Nothing would go wrong, but in hash-table based containers, you're going from approx O(1) to O(n) performance when doing a lookup. You'd be better off simply storing everything in a List and brute force searching it for items that fulfil equality.

If a common use-case is comparing objects according to one of their properties, you could add an additional constructor and implement and call it like this:

public GenericEqualityComparer(Func<T, object> projection)
{
    compareFunction = (t1, t2) => projection(t1).Equals(projection(t2));
    hashFunction = t => projection(t).GetHashCode();
}

var comaparer = new GenericEqualityComparer( o => o.PropertyToCompare);

This will automatically use the hash as implemented by the property.

EDIT: a more efficient and robust implementation inspired my Marc's comments:

public static GenericEqualityComparer<T> Create<TValue>(Func<T, TValue> projection)
{
    return new GenericEqualityComparer<T>(
        (t1, t2) => EqualityComparer<TValue>.Default.Equals( projection(t1), projection(t2)),
        t => EqualityComparer<TValue>.Default.GetHashCode(projection(t)));
}

var comparer = GenericEqualityComparer<YourObjectType>.Create( o => o.PropertyToCompare); 

Found this one on CodeProject - A Generic IEqualityComparer for Linq Distinct() nicely done.

Use case:

IEqualityComparer<Contact> c =  new PropertyComparer<Contact>("Name");
IEnumerable<Contact> distinctEmails = collection.Distinct(c); 

Generic IEqualityComparer

public class PropertyComparer<T> : IEqualityComparer<T>
{
    private PropertyInfo _PropertyInfo;

    /// <summary>
    /// Creates a new instance of PropertyComparer.
    /// </summary>
    /// <param name="propertyName">The name of the property on type T 
    /// to perform the comparison on.</param>
    public PropertyComparer(string propertyName)
    {
        //store a reference to the property info object for use during the comparison
        _PropertyInfo = typeof(T).GetProperty(propertyName, 
    BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
        if (_PropertyInfo == null)
        {
            throw new ArgumentException(string.Format("{0} 
        is not a property of type {1}.", propertyName, typeof(T)));
        }
    }

    #region IEqualityComparer<T> Members

    public bool Equals(T x, T y)
    {
        //get the current value of the comparison property of x and of y
        object xValue = _PropertyInfo.GetValue(x, null);
        object yValue = _PropertyInfo.GetValue(y, null);

        //if the xValue is null then we consider them equal if and only if yValue is null
        if (xValue == null)
            return yValue == null;

        //use the default comparer for whatever type the comparison property is.
        return xValue.Equals(yValue);
    }

    public int GetHashCode(T obj)
    {
        //get the value of the comparison property out of obj
        object propertyValue = _PropertyInfo.GetValue(obj, null);

        if (propertyValue == null)
            return 0;

        else
            return propertyValue.GetHashCode();
    }

    #endregion
}  

Your performance will go down the drain. Distinct and Except are efficient operations when implemented on set data structures. By providing a constant hash value you essentially destroy this characteristic and force naive algorithm using a linear search.

You need to see whether this is acceptable for your data volume. But for somewhat larger data sets, the difference will be pronounced. For example, Except will increase from expected time O( n ) to O( n ²), which can be a big deal.

Rather than providing a constant, why not just call the object's own GetHashCode method? It may not give a particularly good value but it cannot be worse than using a constant, and correctness will still be preserved unless the GetHashCode method of the object is overridden to return wrong values.

I needed to rewrite Henrik solution as a class implementing IEqualityComparer which gives this:

    public class GenericEqualityComparer<T,TKey> : IEqualityComparer<T>
    {
        private readonly Func<T, TKey> _keyFunction;

        public GenericEqualityComparer(Func<T, TKey> keyFunction)
        {
            _keyFunction = keyFunction;
        }

        public bool Equals(T x, T y) => EqualityComparer<TKey>.Default.Equals(_keyFunction(x), _keyFunction(y));

        public int GetHashCode(T obj)=> EqualityComparer<TKey>.Default.GetHashCode(_keyFunction(obj));
    }

Try this code:

public class GenericCompare<T> : IEqualityComparer<T> where T : class
{
    private Func<T, object> _expr { get; set; }
    public GenericCompare(Func<T, object> expr)
    {
        this._expr = expr;
    }
    public bool Equals(T x, T y)
    {
        var first = _expr.Invoke(x);
        var sec = _expr.Invoke(y);
        if (first != null && first.Equals(sec))
            return true;
        else
            return false;
    }
    public int GetHashCode(T obj)
    {
        return obj.GetHashCode();
    }
}

Example: collection = collection.Except(ExistedDataEles, new GenericCompare(x=>x.Id)).ToList();

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM