What is causing this implementation of GetHashCode to be 20 times slower than .NET's implementation?

I got the idea of a Substring struct from this post and this one. The second post has the implementation of .NET's String.GetHashCode(). (I'm not sure which version of .NET it is from.)

Here is the implementation. (GetHashCode is taken from the second source listed above.)

public struct Substring
{
    private string String;
    private int Offset;
    public int Length { get; private set; }
    public char this[int index] { get { return String[Offset + index]; } }

    public Substring(string str, int offset, int len) : this()
    {
        String = str;
        Offset = offset;
        Length = len;
    }

    /// <summary>
    /// See http://www.dotnetperls.com/gethashcode
    /// </summary>
    /// <returns></returns>
    public unsafe override int GetHashCode()
    {
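        // Pin the string, then advance the pointer to the start of the substring.
        // (Inside `fixed`, `String + Offset` would be string + int concatenation.)
        fixed (char* str = String)
        {
            char* chPtr = str + Offset;
            int num = 352654597;
            int num2 = num;
            // Each step reads two chars as one 32-bit value. For an odd Length the
            // last read also takes in the char just past the substring.
            int* numPtr = (int*)chPtr;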
            for (int i = Length; i > 0; i -= 4)
            {
                num = (((num << 5) + num) + (num >> 27)) ^ numPtr[0];
                if (i <= 2)
                {
                    break;
                }
                num2 = (((num2 << 5) + num2) + (num2 >> 27)) ^ numPtr[1];
                numPtr += 2;
            }
            return (num + (num2 * 1566083941));
        }
    }
}

Here's a unit test:

    [Test]
    public void GetHashCode_IsAsFastAsString()
    {
        var s = "The quick brown fox";
        var sub = new Substring(s, 4, 5); // "quick" starts at index 4
        var t = "quick";
        var sum = 0;

        sum += sub.GetHashCode(); // make sure GetHashCode is jitted 

        var count = 100000000;
        var sw = Stopwatch.StartNew();
        for (var i = 0; i < count; ++i)
            sum += t.GetHashCode();
        var t1 = sw.Elapsed;
        sw = Stopwatch.StartNew();
        for (var i = 0; i < count; ++i)
            sum += sub.GetHashCode();
        var t2 = sw.Elapsed;

        Debug.WriteLine(sum.ToString()); // make sure we use the return value
        // TotalMilliseconds is the full elapsed time; Milliseconds is only the 0-999 component.
        var m1 = t1.TotalMilliseconds;
        var m2 = t2.TotalMilliseconds;
        Assert.IsTrue(m2 <= m1); // fat chance
    }

The problem is that m1 is 10 milliseconds and m2 is 190 milliseconds. (Note: these timings are for 1000000 iterations.) FYI, I ran this on .NET 4.5, 64-bit, as a Release build with optimizations turned on.
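A minimal sketch of how a timing like this could be taken in a small console program. The names `HashTiming`, `TimeMs`, and `IsJitOptimizerDisabled` are hypothetical helpers, not part of the original post or of any framework. The sketch makes one warm-up call before timing, reads `TimeSpan.TotalMilliseconds` (the `Milliseconds` property is only the 0-999 component), and prints whether a debugger is attached and whether the assembly was compiled with the JIT optimizer disabled. It cannot see IDE debugger settings; it only reports what the runtime exposes.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class HashTiming
{
    // Hypothetical helper: calls `body` once as a warm-up, then `iterations`
    // more times under a Stopwatch. Returns elapsed wall-clock milliseconds.
    public static double TimeMs(int iterations, Func<int> body, out int sink)
    {
        sink = body(); // warm-up so JIT compilation is not part of the timing
        var sw = Stopwatch.StartNew();
        for (var i = 0; i < iterations; ++i)
            sink += body(); // accumulate so the result is actually used
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    // True when `assembly` carries a DebuggableAttribute that disables the
    // JIT optimizer (typical of Debug builds).
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        var attr = (DebuggableAttribute)Attribute.GetCustomAttribute(
            assembly, typeof(DebuggableAttribute));
        return attr != null && attr.IsJITOptimizerDisabled;
    }

    public static void Main()
    {
        Console.WriteLine("Debugger attached: " + Debugger.IsAttached);
        Console.WriteLine("JIT optimizer disabled: "
            + IsJitOptimizerDisabled(typeof(HashTiming).Assembly));

        var t = "quick";
        int sink;
        var ms = TimeMs(1000000, () => t.GetHashCode(), out sink);
        Console.WriteLine("string.GetHashCode: " + ms + " ms (sink " + sink + ")");
    }
}
```

The delegate call adds overhead of its own, so the absolute numbers are inflated. The comparison between two candidates stays fair only as long as both are measured through the same helper.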

  1. Clued in by a comment, I double-checked that optimized code was actually running. It turns out that an obscure debugger setting was disabling optimizations. So I unchecked Tools – Options – Debugging – General – Suppress JIT optimization on module load (Managed only). This caused the optimized code to load properly.
  2. Even with optimizations turned on there is still about a 3x - 6x difference. However, this might be attributable to the fact that the code above is the .NET 32-bit version and I'm running 64-bit .NET. Porting the 64-bit implementation of string.GetHashCode to Substring is not as easy because it relies on a zero end-of-string marker (which is in fact a bug).
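To illustrate the terminator issue, here is a sketch of a hash that is bounded by length and never reads outside the substring. It is an assumption-laden illustration, not the framework's implementation and not a drop-in port of the 64-bit version. `BoundedHash`, `Hash`, and `Pair` are hypothetical names. It reuses the mixing steps and constants from the code in the question, but packs chars through the string indexer instead of pointer reads and treats a missing trailing char as 0, which is the role the end-of-string marker plays for a standalone string.

```csharp
using System;

public static class BoundedHash
{
    // Hypothetical helper: hashes exactly `length` chars of `s` starting at
    // `offset`, two chars per mixing step, without reading past the range.
    public static int Hash(string s, int offset, int length)
    {
        if (s == null) throw new ArgumentNullException("s");
        if (offset < 0 || length < 0 || offset > s.Length - length)
            throw new ArgumentOutOfRangeException();

        unchecked // integer overflow is intended in the mixing arithmetic
        {
            int num = 352654597;
            int num2 = num;
            int end = offset + length;
            for (int i = offset; i < end; i += 4)
            {
                num = (((num << 5) + num) + (num >> 27)) ^ Pair(s, i, end);
                if (i + 2 >= end) break; // two or fewer chars were left
                num2 = (((num2 << 5) + num2) + (num2 >> 27)) ^ Pair(s, i + 2, end);
            }
            return num + (num2 * 1566083941);
        }
    }

    // Packs s[i] into the low 16 bits and s[i + 1] into the high 16 bits.
    // A char at or past `end` counts as 0 instead of being read.
    private static int Pair(string s, int i, int end)
    {
        int lo = s[i];
        int hi = (i + 1 < end) ? s[i + 1] : 0;
        return lo | (hi << 16);
    }
}
```

With this shape, `BoundedHash.Hash("The quick brown fox", 4, 5)` and `BoundedHash.Hash("quick", 0, 5)` return the same value, because the surrounding text never enters the computation. The indexer adds bounds checks, so this sketch trades some speed for safety and makes no claim about matching the performance or the values of string.GetHashCode.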

At this time I'm disappointed about not reaching performance parity, but this was an excellent use of my time for learning about some of the perils and pitfalls of optimizing C#.
