Why is using a local copy of ThreadLocal's Value member faster, even though it is a reference type?

Question

I'm following the example on page 107 of Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4 ( https://www.microsoft.com/en-us/download/details.aspx?id=19222 ). It states that using a local copy of the ThreadLocal's Value member is faster than using ThreadLocal.Value itself. I tested this and it is indeed the case. But why?

As can be seen in the code, a local copy of _vector2.Value is saved in vector2 and this local copy is used to sum all items. If you use _vector2.Value[i] += _vector1.Value[i] instead of vector2[i] += vector1[i], the code produces the same result but runs slower. This is what the document states. Now, int[] is a reference type. This means that when you make a copy in vector2, you are actually copying a reference to the original int[] held in ThreadLocal's Value member. This is corroborated by commenting out _vector2.Value = vector2: the printed result remains the same. So I don't think this assignment is needed.

Now, since _vector2.Value and vector2 reference the same data, how is it possible that working with the local copy (vector2) is still faster? It is approximately 4 times faster in my test. Does anyone have any idea what I'm missing?

    using System;
    using System.Linq;
    using System.Threading;

    class ReferenceList
    {
        const int VECTOR_LENGTH = 100000000;
        private ThreadLocal<int[]> _vector1 = new ThreadLocal<int[]>(() => Enumerable.Range(1, VECTOR_LENGTH).ToArray());
        private ThreadLocal<int[]> _vector2 = new ThreadLocal<int[]>(() => Enumerable.Range(1, VECTOR_LENGTH).ToArray());

        internal void DoWork()
        {
            int[] vector1 = _vector1.Value;
            int[] vector2 = _vector2.Value;

            for (int i = 0; i < VECTOR_LENGTH; i++)
            {
                // This is the fast way (as in the document)
                vector2[i] += vector1[i];

                // This is the slow way
                //_vector2.Value[i] += _vector1.Value[i];
            }

            // int[] is a reference type, so I think this step is not needed: the result does not change when this line is commented out.
            _vector2.Value = vector2;

            Console.WriteLine($"Thread-{Thread.CurrentThread.ManagedThreadId} Result: {String.Join(", ", _vector2.Value.Take(10))}");
        }
    }

Answer 1

vector1 is a reference directly to the array. Nothing is ever going to be faster than that.
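To make that concrete, here is a minimal sketch, assuming a three-element array in place of the question's vectors (the class name `ReferenceCheck` and its method names are made up for illustration). It checks that the local variable and `.Value` refer to the same array object on the current thread, so a write through the local is visible through `.Value` without assigning it back.

```csharp
using System;
using System.Threading;

static class ReferenceCheck
{
    // Hypothetical small vector; the question uses 100,000,000 elements.
    static readonly ThreadLocal<int[]> Vector =
        new ThreadLocal<int[]>(() => new[] { 1, 2, 3 });

    // True when the local and .Value point at the same array on this thread.
    public static bool SameArray()
    {
        int[] local = Vector.Value;
        return ReferenceEquals(local, Vector.Value);
    }

    // Writes through the local and reads the element back through .Value.
    public static int WriteThroughLocal()
    {
        int[] local = Vector.Value;
        local[0] = 42;
        return Vector.Value[0];
    }

    static void Main()
    {
        Console.WriteLine(SameArray());         // True
        Console.WriteLine(WriteThroughLocal()); // 42
    }
}
```

This matches the observation in the question that commenting out `_vector2.Value = vector2` does not change the printed result.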

_vector1 is not a reference directly to the array. _vector1.Value yields the same value, but it takes some effort to get that value, as the source code shows. Thus, every time you ask for .Value you take that performance hit (of executing methods etc.) again; even though you know it will return the same value, the getter still has to work that out. And that ignores other related costs, such as a possible reduction in data locality and increased cache misses.
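The difference can be measured with a rough sketch like the one below. It is not a rigorous benchmark: the class name `ValueAccessCost`, the method names, and the 10-million-element length are assumptions chosen for illustration, and the timings will vary by machine, runtime, and build configuration. One method calls the `Value` getter on every iteration, the other calls it once and then indexes the array directly.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

static class ValueAccessCost
{
    // Assumption: 10 million elements, smaller than the question's vectors.
    const int Length = 10000000;

    static readonly ThreadLocal<int[]> Data =
        new ThreadLocal<int[]>(() => Enumerable.Range(1, Length).ToArray());

    // Calls the Value getter on every iteration.
    public static long SumViaValue()
    {
        long sum = 0;
        for (int i = 0; i < Length; i++)
            sum += Data.Value[i];
        return sum;
    }

    // Calls the Value getter once, then indexes the array directly.
    public static long SumViaLocal()
    {
        int[] local = Data.Value;
        long sum = 0;
        for (int i = 0; i < local.Length; i++)
            sum += local[i];
        return sum;
    }

    static void Main()
    {
        int[] warmUp = Data.Value; // run the factory before timing starts

        Stopwatch sw = Stopwatch.StartNew();
        long viaValue = SumViaValue();
        long valueMs = sw.ElapsedMilliseconds;

        sw.Restart();
        long viaLocal = SumViaLocal();
        long localMs = sw.ElapsedMilliseconds;

        Console.WriteLine("Value each iteration: " + valueMs + " ms, sum " + viaValue);
        Console.WriteLine("Local copy:           " + localMs + " ms, sum " + viaLocal);
    }
}
```

Both methods return the same sum because they read the same array; only the number of `Value` getter calls differs.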

Solution 1
1 · Accepted · 2019-07-20 12:57:25
