Generic IEqualityComparer<T> and GetHashCode

Being somewhat lazy about implementing lots of IEqualityComparers, and given that I couldn't easily edit the class implementations of the objects being compared, I went with the following, meant to be used with the Distinct() and Except() extension methods:

public class GenericEqualityComparer<T> : IEqualityComparer<T>
{
    Func<T, T, bool> compareFunction;
    Func<T, int> hashFunction;

    public GenericEqualityComparer(Func<T, T, bool> compareFunction, Func<T, int> hashFunction)
    {
        this.compareFunction = compareFunction;
        this.hashFunction = hashFunction;
    }

    public bool Equals(T x, T y)
    {
        return compareFunction(x, y);
    }

    public int GetHashCode(T obj)
    {
        return hashFunction(obj);
    }
}

Seems nice, but is providing a hash function every time REALLY necessary? I understand that the hash code is used to put objects into buckets. If two objects land in different buckets, they are not equal, and Equals is not called.

If GetHashCode returns the same value, Equals is called. (From: Why is it important to override GetHashCode when Equals method is overridden?)

So what could go wrong if, for example (and I hear a lot of programmers screaming in horror), GetHashCode returns a constant, to force the call to Equals?

Nothing would go wrong, but in hash-table based containers, you're going from approximately O(1) to O(n) performance when doing a lookup. You'd be better off simply storing everything in a List and brute-force searching it for items that fulfil equality.
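To make that cost visible, here is a minimal sketch that counts how often Equals is invoked when Distinct() runs with a constant hash versus a real one. CountingComparer and HashCostDemo are hypothetical names introduced only for this illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: takes the hash function from the caller and counts Equals calls.
public sealed class CountingComparer : IEqualityComparer<int>
{
    private readonly Func<int, int> _hash;

    public long EqualsCalls { get; private set; }

    public CountingComparer(Func<int, int> hash)
    {
        _hash = hash;
    }

    public bool Equals(int x, int y)
    {
        EqualsCalls++;
        return x == y;
    }

    public int GetHashCode(int obj)
    {
        return _hash(obj);
    }
}

public static class HashCostDemo
{
    // Runs Distinct() over n distinct integers and returns how many
    // times the comparer's Equals method was called along the way.
    public static long CountEqualsCalls(int n, Func<int, int> hash)
    {
        var comparer = new CountingComparer(hash);
        int distinctCount = Enumerable.Range(0, n).Distinct(comparer).Count();
        if (distinctCount != n)
        {
            throw new InvalidOperationException("Distinct() dropped an item unexpectedly.");
        }
        return comparer.EqualsCalls;
    }
}
```

Calling HashCostDemo.CountEqualsCalls(1000, x => 0) forces every new item to be compared against the items already seen, so the number of Equals calls grows roughly quadratically with n. With HashCostDemo.CountEqualsCalls(1000, x => x) the hash codes already separate the items and Equals is rarely, if ever, needed.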

If a common use case is comparing objects according to one of their properties, you could add an additional constructor and implement and call it like this:

public GenericEqualityComparer(Func<T, object> projection)
{
    compareFunction = (t1, t2) => projection(t1).Equals(projection(t2));
    hashFunction = t => projection(t).GetHashCode();
}

var comparer = new GenericEqualityComparer<YourObjectType>(o => o.PropertyToCompare);

This will automatically use the hash code as implemented by the property's type.

EDIT: a more efficient and robust implementation, inspired by Marc's comments:

public static GenericEqualityComparer<T> Create<TValue>(Func<T, TValue> projection)
{
    return new GenericEqualityComparer<T>(
        (t1, t2) => EqualityComparer<TValue>.Default.Equals(projection(t1), projection(t2)),
        t => EqualityComparer<TValue>.Default.GetHashCode(projection(t)));
}

var comparer = GenericEqualityComparer<YourObjectType>.Create(o => o.PropertyToCompare);
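As a rough sketch of how this factory might be used with Distinct() and Except(): the Person type, the ComparerUsage class and the choice of Email as the key are all hypothetical, and the comparer class is repeated only so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class GenericEqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> compareFunction;
    private readonly Func<T, int> hashFunction;

    public GenericEqualityComparer(Func<T, T, bool> compareFunction, Func<T, int> hashFunction)
    {
        this.compareFunction = compareFunction;
        this.hashFunction = hashFunction;
    }

    // Builds a comparer that compares and hashes only the projected key.
    public static GenericEqualityComparer<T> Create<TValue>(Func<T, TValue> projection)
    {
        return new GenericEqualityComparer<T>(
            (t1, t2) => EqualityComparer<TValue>.Default.Equals(projection(t1), projection(t2)),
            t => EqualityComparer<TValue>.Default.GetHashCode(projection(t)));
    }

    public bool Equals(T x, T y)
    {
        return compareFunction(x, y);
    }

    public int GetHashCode(T obj)
    {
        return hashFunction(obj);
    }
}

// Hypothetical type used only for this illustration.
public class Person
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public static class ComparerUsage
{
    private static readonly IEqualityComparer<Person> ByEmail =
        GenericEqualityComparer<Person>.Create(p => p.Email);

    // Keeps one person per distinct e-mail address.
    public static List<Person> DistinctByEmail(IEnumerable<Person> people)
    {
        return people.Distinct(ByEmail).ToList();
    }

    // Removes everyone whose e-mail address also appears in the excluded list.
    public static List<Person> WithoutEmailsOf(IEnumerable<Person> people, IEnumerable<Person> excluded)
    {
        return people.Except(excluded, ByEmail).ToList();
    }
}
```

Because both delegates go through EqualityComparer<TValue>.Default, a null Email does not throw here, which is the robustness gain over calling projection(t).Equals(...) directly.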

Found this one on CodeProject, "A Generic IEqualityComparer for Linq Distinct()"; nicely done.

Use case:

IEqualityComparer<Contact> c =  new PropertyComparer<Contact>("Name");
IEnumerable<Contact> distinctEmails = collection.Distinct(c); 

Generic IEqualityComparer

using System;
using System.Collections.Generic;
using System.Reflection;

public class PropertyComparer<T> : IEqualityComparer<T>
{
    private PropertyInfo _PropertyInfo;

    /// <summary>
    /// Creates a new instance of PropertyComparer.
    /// </summary>
    /// <param name="propertyName">The name of the property on type T
    /// to perform the comparison on.</param>
    public PropertyComparer(string propertyName)
    {
        //store a reference to the property info object for use during the comparison
        _PropertyInfo = typeof(T).GetProperty(propertyName,
            BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
        if (_PropertyInfo == null)
        {
            throw new ArgumentException(string.Format("{0} is not a property of type {1}.",
                propertyName, typeof(T)));
        }
    }

    #region IEqualityComparer<T> Members

    public bool Equals(T x, T y)
    {
        //get the current value of the comparison property of x and of y
        object xValue = _PropertyInfo.GetValue(x, null);
        object yValue = _PropertyInfo.GetValue(y, null);

        //if the xValue is null then we consider them equal if and only if yValue is null
        if (xValue == null)
            return yValue == null;

        //use the default comparer for whatever type the comparison property is.
        return xValue.Equals(yValue);
    }

    public int GetHashCode(T obj)
    {
        //get the value of the comparison property out of obj
        object propertyValue = _PropertyInfo.GetValue(obj, null);

        if (propertyValue == null)
            return 0;
        else
            return propertyValue.GetHashCode();
    }

    #endregion
}


Your performance will go down the drain. Distinct and Except are efficient operations when implemented on set data structures. By providing a constant hash value you essentially destroy this characteristic and force a naive algorithm using a linear search.

You need to see whether this is acceptable for your data volume. But for somewhat larger data sets, the difference will be pronounced. For example, Except will increase from expected time O(n) to O(n²), which can be a big deal.

Rather than providing a constant, why not just call the object's own GetHashCode method? It may not give a particularly good value, but it cannot be worse than using a constant, and correctness will still be preserved unless the GetHashCode method of the object is overridden to return wrong values.
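One caveat is worth checking before relying on the object's own GetHashCode: the comparer's Equals and GetHashCode have to agree, meaning that items Equals treats as equal must also return the same hash code. If Equals compares a single property while the hash comes from a type that does not override GetHashCode, two separate instances with the same property value will typically get different hash codes, and Equals is then never consulted. The sketch below lets you try both variants; Item, IdComparer and HashConsistencyDemo are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; it deliberately does not override Equals or GetHashCode.
public class Item
{
    public int Id { get; set; }
}

// Compares items by Id; the hash function is supplied by the caller
// so that different hashing strategies can be tried against the same Equals.
public sealed class IdComparer : IEqualityComparer<Item>
{
    private readonly Func<Item, int> _hash;

    public IdComparer(Func<Item, int> hash)
    {
        _hash = hash;
    }

    public bool Equals(Item x, Item y)
    {
        return x.Id == y.Id;
    }

    public int GetHashCode(Item obj)
    {
        return _hash(obj);
    }
}

public static class HashConsistencyDemo
{
    // Two separate instances that share the same Id are passed through Distinct().
    public static int CountDistinct(Func<Item, int> hash)
    {
        var items = new[] { new Item { Id = 1 }, new Item { Id = 1 } };
        return items.Distinct(new IdComparer(hash)).Count();
    }
}
```

HashConsistencyDemo.CountDistinct(i => i.Id.GetHashCode()) hashes the compared key and yields a single item. HashConsistencyDemo.CountDistinct(i => i.GetHashCode()) uses the object's own hash and will usually keep both items, because each instance gets its own default hash code.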

I needed to rewrite Henrik's solution as a class implementing IEqualityComparer, which gives this:

    public class GenericEqualityComparer<T,TKey> : IEqualityComparer<T>
    {
        private readonly Func<T, TKey> _keyFunction;

        public GenericEqualityComparer(Func<T, TKey> keyFunction)
        {
            _keyFunction = keyFunction;
        }

        public bool Equals(T x, T y) => EqualityComparer<TKey>.Default.Equals(_keyFunction(x), _keyFunction(y));

        public int GetHashCode(T obj) => EqualityComparer<TKey>.Default.GetHashCode(_keyFunction(obj));
    }

Try this code:

public class GenericCompare<T> : IEqualityComparer<T> where T : class
{
    private Func<T, object> _expr { get; set; }

    public GenericCompare(Func<T, object> expr)
    {
        this._expr = expr;
    }

    public bool Equals(T x, T y)
    {
        var first = _expr.Invoke(x);
        var sec = _expr.Invoke(y);
        return first != null && first.Equals(sec);
    }

    public int GetHashCode(T obj)
    {
        // Hash the compared key (not obj itself) so that items equal by key share a hash code.
        var key = _expr.Invoke(obj);
        return key == null ? 0 : key.GetHashCode();
    }
}

Example (YourObjectType stands for the element type of the collection, since a constructor call cannot infer the type argument): collection = collection.Except(ExistedDataEles, new GenericCompare<YourObjectType>(x => x.Id)).ToList();

