
C# GetHashCode Implementation

Is

public override int GetHashCode()
{
    return Word.GetHashCode();
}

really the same as

public override int GetHashCode()
{
    return (int) Word.GetHashCode() * 7;
}

with regard to uniqueness?

Word is of type String.

EDIT: I forgot to ask: which one is better to implement in the program, option 1 or option 2?

It is clear that any collision in the Word.GetHashCode() implementation would result in a collision in the (int) Word.GetHashCode() * 7 implementation, because multiplying identical numbers produces identical results.

A more interesting question is whether non-colliding hash codes from the first implementation could collide in the second implementation. It turns out that the answer is "no", because 7 and the size of the int range (2^32) are coprime. Hence, multiplication produces a one-to-one mapping after the overflow bits are dropped.
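The coprimality argument can be spelled out as a short derivation. This is only a sketch: it assumes 32-bit wraparound arithmetic, and a and b are placeholder names for two hash codes.

```latex
\gcd(7, 2^{32}) = 1
  \;\Longrightarrow\; 7 \text{ has an inverse modulo } 2^{32}

7 \cdot 3067833783 = 21474836481 = 5 \cdot 2^{32} + 1 \equiv 1 \pmod{2^{32}}

7a \equiv 7b \pmod{2^{32}}
  \;\Longrightarrow\; a \equiv b \pmod{2^{32}}
```

Multiplying both sides of the last congruence by the inverse 3067833783 (0xB6DB6DB7) recovers the original values, so two distinct hash codes can never be mapped to the same product.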

You can run a small test with two-byte hash codes to see what happens:

const int Max = 1<<16;
var count = new int[Max];
for (int i = 0 ; i != Max ; i++) {
    count[(i * 7) & (Max-1)]++;
}
var notOne = 0;
for (int i = 0 ; i != Max ; i++) {
    if (count[i] != 1) {
        notOne++;
    }
}
Console.WriteLine("Count of duplicate mappings found: {0}", notOne);

This program maps i, the hash code value, to 7 * i modulo 2^16, and verifies that each number in the range is produced exactly once.

Count of duplicate mappings found: 0


If you replace 7 with an even number, the result is going to be very different. Now multiple hash codes from the original set would be mapped to a single hash code in the target set. You can understand this intuitively if you recall that multiplying by an even number always makes the least significant bit zero. Hence, some information is lost, and how much depends on how many times the even number can be divided by two.
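The effect of an even multiplier can be checked with a variation of the same 16-bit experiment. The sketch below is illustrative only; the class name EvenMultiplierDemo and the method DistinctOutputs are made up for this example and do not come from the original answer.

```csharp
using System;

static class EvenMultiplierDemo
{
    // Counts how many distinct values (i * multiplier) mod 2^16 takes
    // as i runs over every two-byte hash code.
    public static int DistinctOutputs(int multiplier)
    {
        const int Max = 1 << 16;
        var seen = new bool[Max];
        int distinct = 0;
        for (int i = 0; i != Max; i++)
        {
            int mapped = (i * multiplier) & (Max - 1);
            if (!seen[mapped])
            {
                seen[mapped] = true;
                distinct++;
            }
        }
        return distinct;
    }

    static void Main()
    {
        // 7 is odd; 6, 4 and 8 contain one, two and three factors of two.
        foreach (int m in new[] { 7, 6, 4, 8 })
        {
            Console.WriteLine("multiplier {0}: {1} distinct outputs", m, DistinctOutputs(m));
        }
    }
}
```

Each factor of two in the multiplier halves the number of reachable values: 65536 for 7, 32768 for 6, 16384 for 4, and 8192 for 8.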

Which one is better?

There is no difference.

Note: The above assumes that you are ignoring integer overflow.

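Since the multiplication is not wrapped in an unchecked context, the latter throws an OverflowException whenever it overflows if the project is compiled with overflow checking enabled (C# leaves it off by default). With checking on, this is very likely to happen: 6/7 of the hash range overflows, so a roughly evenly distributed hash code has about a 6/7 chance of throwing.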
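One way to make the second option independent of the project's overflow setting is to wrap the multiplication in unchecked. The following is a minimal sketch rather than a recommended implementation; OverflowDemo, ScaledHash and OverflowsWhenChecked are hypothetical names introduced here for illustration.

```csharp
using System;

static class OverflowDemo
{
    // Option 2 with explicit wraparound: never throws, whatever the compiler setting.
    public static int ScaledHash(string word)
    {
        return unchecked(word.GetHashCode() * 7);
    }

    // Reports whether hash * 7 would throw in a checked context.
    public static bool OverflowsWhenChecked(int hash)
    {
        try
        {
            _ = checked(hash * 7);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    static void Main()
    {
        // Only values with |hash| <= int.MaxValue / 7 survive a checked multiplication.
        foreach (int h in new[] { 1, 306783378, 306783379, int.MaxValue })
        {
            Console.WriteLine("{0}: overflows when checked = {1}", h, OverflowsWhenChecked(h));
        }
        Console.WriteLine(ScaledHash("example"));
    }
}
```

The boundary sits at int.MaxValue / 7 = 306783378: that value and everything closer to zero multiplies safely, while 306783379 and beyond overflow, which is where the 6/7 figure comes from.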
