
GetHashCode override of object containing generic array

I have a class that contains the following two properties:

public int Id      { get; private set; }
public T[] Values  { get; private set; }

I have made it implement IEquatable<SimpleTableRow<T>> and overridden object.Equals like this:

public override bool Equals(object obj)
{
    return Equals(obj as SimpleTableRow<T>);
}

public bool Equals(SimpleTableRow<T> other)
{
    // Check for null
    if(ReferenceEquals(other, null))
        return false;

    // Check for same reference
    if(ReferenceEquals(this, other))
        return true;

    // Check for same Id and same Values
    return Id == other.Id && Values.SequenceEqual(other.Values);
}

Having overridden object.Equals, I must of course also override GetHashCode. But what code should I implement? How do I create a hash code out of a generic array? And how do I combine it with the Id integer?

public override int GetHashCode()
{
    return // What?
}

Because of the problems raised in this thread, I'm posting another reply showing what happens if you get it wrong. Mainly, you can't use the array's GetHashCode(), because it is based on the array reference rather than on its contents. The correct behaviour is that no warnings are printed when you run it. Switch the comments to fix it:

using System;
using System.Collections.Generic;
using System.Linq;
static class Program
{
    static void Main()
    {
        // first and second are logically equivalent
        SimpleTableRow<int> first = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6),
            second = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6);

        if (first.Equals(second) && first.GetHashCode() != second.GetHashCode())
        { // proven Equals, but GetHashCode() disagrees
            Console.WriteLine("We have a problem");
        }
        HashSet<SimpleTableRow<int>> set = new HashSet<SimpleTableRow<int>>();
        set.Add(first);
        set.Add(second);
        // which confuses anything that uses hash algorithms
        if (set.Count != 1) Console.WriteLine("Yup, very bad indeed");
    }
}
class SimpleTableRow<T> : IEquatable<SimpleTableRow<T>>
{

    public SimpleTableRow(int id, params T[] values) {
        this.Id = id;
        this.Values = values;
    }
    public int Id { get; private set; }
    public T[] Values { get; private set; }

    public override int GetHashCode() // wrong
    {
        return Id.GetHashCode() ^ Values.GetHashCode();
    }
    /*
    public override int GetHashCode() // right
    {
        unchecked // overflow is fine for hashing, even in a checked build
        {
            int hash = Id;
            if (Values != null)
            {
                hash = (hash * 17) + Values.Length;
                foreach (T t in Values)
                {
                    hash *= 17;
                    if (t != null) hash = hash + t.GetHashCode();
                }
            }
            return hash;
        }
    }
    */
    public override bool Equals(object obj)
    {
        return Equals(obj as SimpleTableRow<T>);
    }
    public bool Equals(SimpleTableRow<T> other)
    {
        // Check for null
        if (ReferenceEquals(other, null))
            return false;

        // Check for same reference
        if (ReferenceEquals(this, other))
            return true;

        // Check for same Id and same Values
        return Id == other.Id && Values.SequenceEqual(other.Values);
    }
}

FWIW, it's very dangerous to use the contents of Values in your hash code. You should only do this if you can guarantee that they will never change. However, since the array is exposed, I don't think guaranteeing that is possible. The hash code of an object should never change. Otherwise, it loses its value as a key in a Hashtable or Dictionary. Consider the hard-to-find bug where an object is used as a key in a Hashtable, its hash code changes because of an outside influence, and you can no longer find it in the Hashtable!
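The failure mode described above can be reproduced in a few lines. This is only a sketch: `ContentKey` and `MutationDemo` are hypothetical names invented for the illustration, and the key type is deliberately stripped down to a single exposed array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical key type: Equals and GetHashCode both depend on the array contents.
sealed class ContentKey : IEquatable<ContentKey>
{
    public int[] Values { get; }
    public ContentKey(params int[] values) { Values = values; }

    public bool Equals(ContentKey other) =>
        other != null && Values.SequenceEqual(other.Values);

    public override bool Equals(object obj) => Equals(obj as ContentKey);

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = Values.Length;
            foreach (int v in Values) hash = hash * 17 + v;
            return hash;
        }
    }
}

static class MutationDemo
{
    static void Main()
    {
        var key = new ContentKey(1, 2, 3);
        var set = new HashSet<ContentKey> { key };

        Console.WriteLine(set.Contains(key)); // found: stored and looked up with the same hash

        key.Values[0] = 99; // outside code mutates the exposed array

        // The set still holds the object, but it was filed under the old hash code,
        // so a lookup that uses the new hash code is expected to miss it.
        Console.WriteLine(set.Contains(key));
        Console.WriteLine(set.Count); // the entry is still counted
    }
}
```

The object never left the set; it simply can no longer be reached through a hash-based lookup, which is what makes this kind of bug hard to spot.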

Since the hash code is essentially a key for storing the object (as in a hashtable), I would just use Id.GetHashCode().

How about something like:

    public override int GetHashCode()
    {
        unchecked // overflow is fine for hashing, even in a checked build
        {
            int hash = Id;
            if (Values != null)
            {
                hash = (hash * 17) + Values.Length;
                foreach (T t in Values)
                {
                    hash *= 17;
                    if (t != null) hash = hash + t.GetHashCode();
                }
            }
            return hash;
        }
    }

This should be compatible with SequenceEqual, rather than doing a reference comparison on the array.
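As an alternative sketch of the same idea, the element-by-element combination can be delegated to the `System.HashCode` struct. This assumes a target framework that ships that type (it is not part of older .NET Framework releases), and `HashCodeDemo` is a made-up name for the demonstration. Note that `System.HashCode` results can differ between program runs, so they should not be persisted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class SimpleTableRow<T> : IEquatable<SimpleTableRow<T>>
{
    public SimpleTableRow(int id, params T[] values) { Id = id; Values = values; }
    public int Id { get; private set; }
    public T[] Values { get; private set; }

    public override int GetHashCode()
    {
        var hash = new HashCode(); // assumption: System.HashCode is available
        hash.Add(Id);
        if (Values != null)
        {
            foreach (T t in Values) hash.Add(t); // Add<T> tolerates null elements
        }
        return hash.ToHashCode();
    }

    public override bool Equals(object obj) => Equals(obj as SimpleTableRow<T>);

    public bool Equals(SimpleTableRow<T> other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        return Id == other.Id && Values.SequenceEqual(other.Values);
    }
}

static class HashCodeDemo
{
    static void Main()
    {
        var first = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6);
        var second = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6);

        // Equal contents give equal hash codes within one process.
        Console.WriteLine(first.GetHashCode() == second.GetHashCode());

        var set = new HashSet<SimpleTableRow<int>> { first, second };
        Console.WriteLine(set.Count); // 1
    }
}
```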

I just had to add another answer because one of the more obvious (and easiest to implement) solutions was not mentioned: not including the collection in your GetHashCode calculation!

The main thing that seems to have been forgotten here is that uniqueness of the GetHashCode result isn't required (or, in many cases, even possible). Unequal objects don't have to return unequal hash codes; the only requirement is that equal objects return equal hash codes. So by that definition, the following implementation of GetHashCode is correct for all objects (assuming there's a correct Equals implementation):

public override int GetHashCode() 
{ 
    return 42; 
} 

Of course this would yield the worst possible performance in hashtable lookup, O(n) instead of O(1), but it is still functionally correct.

With that in mind, my general recommendation when implementing GetHashCode for an object that happens to have any kind of collection as one or more of its members is to simply ignore them and calculate GetHashCode solely based on the other scalar members. This would work pretty well except if you put into a hash table a huge number of objects where all their scalar members have identical values, resulting in identical hash codes.

Ignoring collection members when calculating the hash code can also yield a performance improvement, despite the poorer distribution of the hash code values. Remember that using a hash code is supposed to improve performance in a hash table by not requiring Equals to be called N times; instead it requires only one call to GetHashCode and a quick hash table lookup. If each object has an inner array with 10,000 items which all participate in the calculation of the hash code, any benefits gained by the good distribution would probably be lost. It would be better to have a marginally less well-distributed hash code if generating it is considerably less costly.
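A minimal sketch of that recommendation, applied to the class from the question, might look like the following. `IdHashedRow` and `IdHashDemo` are hypothetical names; the only point is that the array takes part in Equals but not in GetHashCode.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class IdHashedRow<T> : IEquatable<IdHashedRow<T>>
{
    public IdHashedRow(int id, params T[] values) { Id = id; Values = values; }
    public int Id { get; }
    public T[] Values { get; }

    // Only the scalar member participates; the array is ignored on purpose.
    public override int GetHashCode() => Id;

    public override bool Equals(object obj) => Equals(obj as IdHashedRow<T>);

    public bool Equals(IdHashedRow<T> other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        return Id == other.Id && Values.SequenceEqual(other.Values);
    }
}

static class IdHashDemo
{
    static void Main()
    {
        var a = new IdHashedRow<int>(1, 2, 3);
        var b = new IdHashedRow<int>(1, 2, 3); // equal to a
        var c = new IdHashedRow<int>(1, 9, 9); // same Id, different Values

        // a and c collide on the hash code, but Equals still tells them apart.
        var set = new HashSet<IdHashedRow<int>> { a, b, c };
        Console.WriteLine(set.Count); // 2
    }
}
```

The collision between `a` and `c` costs one extra Equals call during lookup, which is the trade-off this answer argues is usually acceptable.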

public override int GetHashCode() {
   return Id.GetHashCode() ^ Values.GetHashCode();  
}

There are several good points in the comments and other answers. The OP should consider whether the Values would be used as part of the "key" if the object were used as a key in a dictionary. If so, then they should be part of the hash code; otherwise, not.

On the other hand, I'm not sure why the GetHashCode method should mirror SequenceEqual. It's meant to compute an index into a hash table, not to be the complete determinant of equality. If there are many hash table collisions using the algorithm above, and if the colliding objects differ in the sequence of the Values, then an algorithm should be chosen that takes sequence into account. If sequence doesn't really matter, save the time and don't take it into account.
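To make the difference concrete, here is a small sketch comparing an order-insensitive combiner (XOR) with an order-sensitive one (multiply and add). `OrderSensitivityDemo`, `XorHash` and `MultiplyAddHash` are hypothetical names, and the constants 17 and 31 are arbitrary choices for the illustration.

```csharp
using System;

static class OrderSensitivityDemo
{
    // XOR combine: the result does not depend on element order.
    public static int XorHash(int[] values)
    {
        int h = 0;
        foreach (int v in values) h ^= v.GetHashCode();
        return h;
    }

    // Multiply-and-add combine: the position of each element affects the result.
    public static int MultiplyAddHash(int[] values)
    {
        unchecked
        {
            int h = 17;
            foreach (int v in values) h = h * 31 + v.GetHashCode();
            return h;
        }
    }

    static void Main()
    {
        int[] a = { 1, 2, 3 };
        int[] b = { 3, 2, 1 };

        Console.WriteLine(XorHash(a) == XorHash(b));                 // True
        Console.WriteLine(MultiplyAddHash(a) == MultiplyAddHash(b)); // False
    }
}
```

Both combiners give equal hash codes for equal sequences, so both are valid alongside SequenceEqual; the XOR version simply collides for every permutation of the same elements.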

I know this thread is pretty old, but I wrote this method to allow me to calculate hash codes of multiple objects. It's been very helpful for this very case. It's not perfect, but it does meet my needs and most likely yours too.

I can't really take any credit for it. I got the concept from some of the .NET GetHashCode implementations. I'm using 419 (after all, it's my favorite large prime), but you can choose just about any reasonable prime (not too small, not too large).

So, here's how I get my hash codes:

using System.Collections.Generic;
using System.Linq;

public static class HashCodeCalculator
{
    public static int CalculateHashCode(params object[] args)
    {
        return args.CalculateHashCode();
    }

    public static int CalculateHashCode(this IEnumerable<object> args)
    {
        // Use a fixed value for null. A fresh "new object()" hashes differently on
        // every call, so equal inputs containing null would get unequal hash codes.
        if (args == null)
            return 0;

        unchecked
        {
            return args.Aggregate(0, (current, next) => (current*419) ^ (next == null ? 0 : next.GetHashCode()));
        }
    }
}

I would do it this way:

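public override int GetHashCode()
{
    int result = Id.GetHashCode(); // int, not long: GetHashCode() must return int
    foreach (T val in Values)
        if (val != null) result ^= val.GetHashCode(); // XOR ignores element order
    return result;
}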

Provided that Id and Values will never change, and Values is not null...

public override int GetHashCode()
{
  return Id ^ Values.GetHashCode();
}

Note that your class is not immutable, since anyone can modify the contents of Values because it is an array. Given that, I wouldn't try to generate a hash code using its contents.
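If a content-based hash code is still wanted, one way around the mutability problem is to stop exposing the array. The following is a sketch under that assumption, not the design from the question: `FrozenTableRow` and `FrozenDemo` are hypothetical names, the constructor takes a defensive copy, and the hash code is computed once. It only protects the array itself; elements that are mutable reference types could still change.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

sealed class FrozenTableRow<T> : IEquatable<FrozenTableRow<T>>
{
    private readonly T[] _values; // private copy, never handed out directly
    private readonly int _hash;   // computed once in the constructor

    public FrozenTableRow(int id, params T[] values)
    {
        Id = id;
        _values = (values ?? new T[0]).ToArray(); // defensive copy of the caller's array
        Values = Array.AsReadOnly(_values);       // read-only wrapper for callers
        unchecked
        {
            int hash = (id * 17) + _values.Length;
            foreach (T t in _values)
            {
                hash *= 17;
                if (t != null) hash += t.GetHashCode();
            }
            _hash = hash;
        }
    }

    public int Id { get; }
    public IReadOnlyList<T> Values { get; }

    public override int GetHashCode() => _hash;

    public override bool Equals(object obj) => Equals(obj as FrozenTableRow<T>);

    public bool Equals(FrozenTableRow<T> other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        return Id == other.Id && _values.SequenceEqual(other._values);
    }
}

static class FrozenDemo
{
    static void Main()
    {
        int[] source = { 2, 3 };
        var row = new FrozenTableRow<int>(1, source);
        int before = row.GetHashCode();

        source[0] = 99; // changing the caller's array no longer affects the row

        Console.WriteLine(before == row.GetHashCode()); // True
        Console.WriteLine(row.Values[0]);               // 2
    }
}
```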
