Fast HashCode of a Complex Object Graph

I have a pretty complex object and I need to determine the uniqueness of these objects. One possible solution is to override GetHashCode(). I have implemented the code shown below:

public override int GetHashCode()
{
    return this._complexObject1.GetHashCode() ^
           this._complexObject2.GetHashCode() ^
           this._complexObject3.GetHashCode() ^
           this._complexObject4.GetHashCode() ^
           this._complexObject5.GetHashCode() ^
           this._complexObject6.GetHashCode() ^
           this._complexObject7.GetHashCode() ^
           this._complexObject8.GetHashCode();
}

These complex objects also override GetHashCode() and perform similar operations.

My project requires uniqueness of these objects, which I deal with very frequently, and the data inside them also changes in various ways and places.

I need a faster way to determine the uniqueness of these complex objects, one that takes both performance and memory into account.

Thanks in advance
Munim

Given your comment, it sounds like you may be trying to rely on GetHashCode on its own to determine uniqueness. Don't do that. Hashes aren't meant to be unique - it's meant to be unlikely that two unequal objects will hash to the same value, but not impossible. If you're trying to check that a set of objects has no duplicates, you will have to use Equals as well.
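For example, here is a minimal sketch of that kind of duplicate check (the helper name HasDuplicates is illustrative only, not something from your code): a HashSet<T> first uses GetHashCode to pick a bucket and only then calls Equals to confirm a genuine match, so the two overrides have to agree with each other.

using System.Collections.Generic;

// Sketch: returns true if any two items compare equal.
// Relies on the items' GetHashCode for bucketing and Equals for the final comparison.
static bool HasDuplicates<T>(IEnumerable<T> items)
{
    var seen = new HashSet<T>();
    foreach (var item in items)
    {
        // Add returns false when an equal item is already in the set,
        // i.e. the hash codes matched and Equals confirmed equality.
        if (!seen.Add(item))
            return true;
    }
    return false;
}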

Note that using XOR for a hashcode can make it more likely that you'll get hash collisions, depending on the individual hash values involved. In particular, it makes any two equal fields "cancel each other out". I generally use this form:

int hash = 17;
hash = hash * 31 + field1.GetHashCode();
hash = hash * 31 + field2.GetHashCode();
hash = hash * 31 + field3.GetHashCode();
hash = hash * 31 + field4.GetHashCode();
...
return hash;

... but even so, that's certainly not going to guarantee uniqueness. You should use GetHashCode() to rule out equality, and then use Equals to check the actual equality of any potentially equal values.
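To make that concrete, here is a hedged sketch of how the hash code and Equals might be paired up (ComplexType and the two fields are placeholders standing in for your real type, and the fields are assumed non-null as in your own code; the unchecked block just lets the multiplications wrap around instead of throwing on overflow):

public override int GetHashCode()
{
    unchecked // let the arithmetic overflow harmlessly
    {
        int hash = 17;
        hash = hash * 31 + _complexObject1.GetHashCode();
        hash = hash * 31 + _complexObject2.GetHashCode();
        return hash;
    }
}

public override bool Equals(object obj)
{
    var other = obj as ComplexType; // placeholder type name
    if (other == null)
        return false;

    // Containers such as HashSet<T> and Dictionary<TKey,TValue> already use the
    // hash code to rule out most candidates before this method is ever called;
    // here we only need to confirm true equality, field by field.
    return _complexObject1.Equals(other._complexObject1)
        && _complexObject2.Equals(other._complexObject2);
}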

Now your question mentions speed - this sounds like the perfect place to use a profiler and some benchmark tests. Are you sure this is a bottleneck? If you have many different types all computing hash values, have you found out which of these is the biggest contributor to the problem?

Some optimisations will depend on exactly how you use the data. If you find that a lot of your time is spent recomputing hashes for values which you know haven't changed, you could cache the hash code... although this obviously becomes trickier when there are fields which themselves refer to complex objects. It's possible that you could cache "leaf node" hashes, particularly if those leaf nodes don't change often (but their usage could vary).
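One possible shape for that kind of caching, sketched under the assumption that the object can tell when its own data changes (LeafNode and its members are illustrative names only):

class LeafNode
{
    private int _value;        // stands in for the leaf's real data
    private int? _cachedHash;  // null means the cached hash is stale

    public int Value
    {
        get { return _value; }
        set
        {
            _value = value;
            _cachedHash = null; // invalidate the cache whenever the data changes
        }
    }

    public override int GetHashCode()
    {
        if (_cachedHash == null)
        {
            // Recompute only when something has changed since the last call.
            _cachedHash = 17 * 31 + _value.GetHashCode();
        }
        return _cachedHash.Value;
    }

    public override bool Equals(object obj)
    {
        var other = obj as LeafNode;
        return other != null && other._value == _value;
    }
}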
