GetHashCode returns the same value for different objects. Is there any method to identify object by particular properties?

I'm trying to write a method that computes a hash code from a chosen set of properties. My code is below:

    private static object GetValue<T>(object item, string propertyName)
    {
        ParameterExpression arg = Expression.Parameter(item.GetType(), "x");
        Expression expr = Expression.Property(arg, propertyName);
        UnaryExpression unaryExpression = Expression.Convert(expr, typeof(object));
        var propertyResolver = Expression.Lambda<Func<T, object>>(unaryExpression, arg).Compile();
        return propertyResolver((T)item);
    }


    private static int GetHashCode<T>(T obj, List<string> columns)
    {
        unchecked
        {
            int hashCode = 17;

            for (var i = 0; i < columns.Count; i++)
            {
                object value = GetValue<T>(obj, columns[i]);
                var tempHashCode = value == null ? 0 : value.GetHashCode();
                hashCode = (hashCode * 23) + tempHashCode;
            }

            return hashCode;
        }
    }

    private static void TestHashCode()
    {
        var t1 = new { ID = (long)2044716, Type = "AE", Method = (short)1022, Index = 3 };
        var t2 = new { ID = (long)12114825, Type = "MEDAPE", Method = (short)1700, Index = 2 };

        var e1 = t1.GetHashCode();
        var e2 = t2.GetHashCode();

        var columns = new[] { "ID", "Type", "Method", "Index" }.ToList();
        var k1 = GetHashCode(t1, columns);
        var k2 = GetHashCode(t2, columns);
    }

The results: e1 is -410666035 and e2 is 101205027, but k1 is 491329214 and k2 is also 491329214.

Hash code steps (for k1):

    hashCode = 17
    tempHashCode = 2044716
    hashCode = 2045107
    tempHashCode = 1591023428
    hashCode = 1638060889
    tempHashCode = 66978814
    hashCode = -912326403
    tempHashCode = 3
    hashCode = 491329214

How can k1 and k2 have the same value, when the default .NET GetHashCode method gives two different values? I want a hash code method that accepts a list of columns, so that the hash code is built from particular properties only. My goal is to get a unique value for an object based on those properties.

How can I identify an object by particular properties if GetHashCode doesn't guarantee a unique value?

I suspect the problem is caused by value.GetHashCode() in your GetHashCode<T> method. The value variable is typed as object there, so I think GetHashCode() may not be returning what you expect. Try stepping through it in the debugger to find out what is happening.

You may want to try to keep your code, but instead of Object.GetHashCode() , use RuntimeHelpers.GetHashCode() (from namespace System.Runtime.CompilerServices ).

Full reference here: https://docs.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.runtimehelpers.gethashcode?redirectedfrom=MSDN&view=netframework-4.7.2#System_Runtime_CompilerServices_RuntimeHelpers_GetHashCode_System_Object_

Good luck!

GetHashCode returns a value that is implementation dependent. It is designed for the "standard" use (hash-based collections) and is meaningful only during the lifetime of the application. The default algorithm is not designed to avoid collisions.

The GetHashCode method is not designed to be unique for each instance.

Your approach relies on combining the hash codes of the individual columns. A hash code has to satisfy certain requirements, for example a good distribution over its domain. However, there is no guarantee that the combination preserves those properties: the more columns you add, the "stranger" the collisions can become.
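To make this concrete, here is a minimal sketch that reuses the combining step from the question (start at 17, multiply by 23, add the next hash) on plain int inputs. The class name CombineDemo and the sample inputs are made up for illustration. The two inputs were picked by hand so that the arithmetic coincides, which shows that different column hashes can combine into the same result:

```csharp
using System;

public static class CombineDemo
{
    // Same combining step as in the question: seed 17, multiply by 23, add each hash.
    public static int Combine(params int[] hashes)
    {
        unchecked
        {
            int hashCode = 17;
            foreach (int h in hashes)
            {
                hashCode = (hashCode * 23) + h;
            }
            return hashCode;
        }
    }

    public static void Main()
    {
        // (17 * 23 + 0) * 23 + 23 == (17 * 23 + 1) * 23 + 0 == 9016
        Console.WriteLine(Combine(0, 23)); // 9016
        Console.WriteLine(Combine(1, 0));  // 9016

        // The intermediate values listed in the question reproduce k1.
        Console.WriteLine(Combine(2044716, 1591023428, 66978814, 3)); // 491329214
    }
}
```

A result only 32 bits wide cannot be unique over a larger input space, whatever the combining formula is, so the hash code should be treated as a filter and not as an identity.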

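Also, you are invoking value.GetHashCode() on a variable of type object, which hides a boxing operation when the property is a value type. johey suggests RuntimeHelpers.GetHashCode() instead. Be aware, though, that this method calls Object.GetHashCode non-virtually: it ignores the type's own override and returns an identity-based hash, so two boxes holding the same value can produce different hash codes.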
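Before switching methods, it may be worth checking how the two calls behave on boxed values. The sketch below is only an illustration (the class and method names are made up). It boxes the same long twice and compares the virtual GetHashCode() call with RuntimeHelpers.GetHashCode(). The virtual call dispatches to Int64.GetHashCode, so it depends only on the value. The RuntimeHelpers call is identity-based, so its result for two separate boxes is not guaranteed to match, and no expected output is given for it:

```csharp
using System;
using System.Runtime.CompilerServices;

public static class BoxedHashDemo
{
    // Virtual call on boxed values: resolved to Int64.GetHashCode, so equal values agree.
    public static bool VirtualHashesMatch()
    {
        object a = 2044716L;
        object b = 2044716L; // a second, separate box with the same value
        return a.GetHashCode() == b.GetHashCode()
            && a.GetHashCode() == 2044716L.GetHashCode();
    }

    public static void Main()
    {
        object a = 2044716L;
        object b = 2044716L;

        Console.WriteLine(VirtualHashesMatch()); // True

        // Identity-based hashes of two distinct boxes; they are not required to be equal.
        Console.WriteLine(RuntimeHelpers.GetHashCode(a) == RuntimeHelpers.GetHashCode(b));
    }
}
```

If the goal is a hash that depends on the property values, the virtual call is the one that has that property.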

The .NET data structures are designed to handle collisions internally. For example, a dictionary uses the hash code to select a bucket and then scans that bucket sequentially, comparing keys with Equals.
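As a rough illustration of that behavior, the sketch below forces every key to collide by using a comparer whose GetHashCode always returns 0. ConstantHashComparer and CollisionDemo are hypothetical names used only for this example. The dictionary still returns the right values because it falls back to Equals, at the cost of scanning all colliding entries:

```csharp
using System;
using System.Collections.Generic;

// A deliberately bad comparer: every key gets the same hash code.
public sealed class ConstantHashComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) => string.Equals(x, y, StringComparison.Ordinal);

    public int GetHashCode(string obj) => 0;
}

public static class CollisionDemo
{
    public static Dictionary<string, int> Build()
    {
        var map = new Dictionary<string, int>(new ConstantHashComparer());
        map["AE"] = 1;
        map["MEDAPE"] = 2;
        return map;
    }

    public static void Main()
    {
        var map = Build();
        Console.WriteLine(map.Count);              // 2
        Console.WriteLine(map["AE"]);              // 1
        Console.WriteLine(map["MEDAPE"]);          // 2
        Console.WriteLine(map.ContainsKey("XYZ")); // False
    }
}
```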

I want to post my solution here. What has been said above is true, but it is not the whole story, so I'll summarize the topic.

GetHashCode always returns the same value for objects that are equal. The reverse does not hold: equal hash codes do not mean the objects are equal, because different objects may share a hash code.

So hash codes are compared first to improve performance, and the objects themselves are compared only if their hash codes are equal.

I created an IEqualityComparer.

    private class CustomEqualityComparer<T> : IEqualityComparer<T>
    {

        private readonly List<string> _columns;
        private readonly bool _enableHashCode;
        private readonly ConcurrentDictionary<string, Func<T, object>> _cache;
        public CustomEqualityComparer(List<string> columns, ConcurrentDictionary<string, Func<T, object>> cache, bool enableHashCode = false)
        {
            _columns = columns;
            _enableHashCode = enableHashCode;
            _cache = cache;
        }

        public bool Equals(T x, T y)
        {
            for (var i = 0; i < _columns.Count; i++)
            {
                object value1 = GetValue(x, _columns[i], _cache);
                object value2 = GetValue(y, _columns[i], _cache);
                // object.Equals is null-safe; value1.Equals(value2) throws if value1 is null.
                if (!object.Equals(value1, value2)) return false;
            }

            return true;
        }

        public int GetHashCode(T obj)
        {
            return _enableHashCode ? GetHashCode(obj, _columns, _cache) : 0;
        }

        private object GetValue(object item, string propertyName, ConcurrentDictionary<string, Func<T, object>> cache)
        {
            if (!cache.TryGetValue(propertyName, out Func<T, object> propertyResolver))
            {
                ParameterExpression arg = Expression.Parameter(item.GetType(), "x");
                Expression expr = Expression.Property(arg, propertyName);
                UnaryExpression unaryExpression = Expression.Convert(expr, typeof(object));
                propertyResolver = Expression.Lambda<Func<T, object>>(unaryExpression, arg).Compile();
                cache.TryAdd(propertyName, propertyResolver);
            }

            return propertyResolver((T)item);
        }

        private int GetHashCode(T obj, List<string> columns, ConcurrentDictionary<string, Func<T, object>> cache)
        {
            unchecked
            {
                var hashCode = 17;

                for (var i = 0; i < columns.Count; i++)
                {
                    object value = GetValue(obj, columns[i], cache);
                    var tempHashCode = value == null ? 0 : value.GetHashCode();
                    hashCode = hashCode * 23 + tempHashCode;
                }

                return hashCode;
            }
        }
    }
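If the set of properties is known at compile time, a simpler alternative to the comparer above is to use a value tuple of those properties as the key. This is a different technique from the dynamic column list and is only a sketch. It assumes C# 7 or later with System.ValueTuple available, and the third sample object t3 is made up for illustration. Value tuples implement Equals and GetHashCode over their elements, so a HashSet or Dictionary handles any hash collisions for you:

```csharp
using System;
using System.Collections.Generic;

public static class TupleKeyDemo
{
    // Counts objects that differ in (ID, Type), ignoring the other properties.
    public static int CountDistinctByIdAndType()
    {
        var t1 = new { ID = 2044716L, Type = "AE", Method = (short)1022, Index = 3 };
        var t2 = new { ID = 12114825L, Type = "MEDAPE", Method = (short)1700, Index = 2 };
        // Hypothetical third object: same ID and Type as t1, other properties differ.
        var t3 = new { ID = 2044716L, Type = "AE", Method = (short)1, Index = 9 };

        var seen = new HashSet<(long, string)>();
        foreach (var t in new[] { t1, t2, t3 })
        {
            seen.Add((t.ID, t.Type));
        }
        return seen.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctByIdAndType()); // 2
    }
}
```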
