简体   繁体   中英

Why does List.Sum() perform poorly as compared to a foreach?

Question: Why does the execution time of Sum() take much longer than a foreach() in the following scenario?

public void TestMethod4()
{
    List<int> numbers = new List<int>();
    for (int i = 0; i < 1000000000; i++)
    {
        numbers.Add(i);
    }

    Stopwatch sw = Stopwatch.StartNew();
    long totalCount = numbers.Sum(num => true ? 1 : 0); // simulating a dummy true condition
    sw.Stop();
    Console.WriteLine("Time taken Sum() : {0}ms", sw.Elapsed.TotalMilliseconds);

    sw = Stopwatch.StartNew();
    totalCount = 0;
    foreach (var num in numbers)
    {
        totalCount += true ? 1 : 0; // simulating a dummy true condition
    }
    sw.Stop();
    Console.WriteLine("Time taken foreach() : {0}ms", sw.Elapsed.TotalMilliseconds);
}

Sample run1

Time taken Sum()     : 21443.8093ms
Time taken foreach() : 4251.9795ms

TL;DR : The difference in times is caused by the CLR applying two separate optimizations in the second case, but not the first case:

  • Linq's Sum operates on IEnumerable<T> , not List<T> .
    • The CLR/JIT does have a special-case optimization for foreach with List<T> , but not if a List<T> is passed as IEnumerable<T> .
      • Which means it's using IEnumerator<T> and incurring the cost of all of virtual-calls associated with that.
      • Whereas using List<T> directly uses static calls (instance method calls are still "static" calls, provided they're not virtual).
  • Linq's Sum accepts a delegate Func<T,Int64> .
    • Functions passed as a delegate Func<T,Int64> are not inlined, even with MethodImplOptions.AggressiveInline .
    • The cost of a delegate invocation are slightly more expensive than virtual calls.

I've reimplemented your SUM program using a variety of different approaches which you access here:https://gist.github.com/Jehoel/1a4fcd2e70374d3694c3a105061a6d1c

My benchmark results (Release build, x64, .NET Core 5, i7-7700HQ):

Approach Time (ms)
Test_Sum_Delegate 118ms
Test_MySum_DirectFunc_IEnum 112ms
Test_MySum_IndirectFunc_IEnum 114ms
Test_MySum_DirectCall_IEnum 89ms
Test_MySum_DirectFunc_List 58ms
Test_MySum_IndirectFunc_List 58ms
Test_MySum_DirectCall_List 37ms
Test_Sum_DelegateLambda 109ms
Test_For_Inline 4ms
Test_For_Delegate 3ms
Test_ForUnrolled_Inline 4ms
Test_ForUnrolled_Delegate 4ms
Test_ForEach_Inline 38ms
Test_ForEach_Delegate 37ms

We can isolate the different behaviour by changing one thing at a time (eg foreach vs for , IEnumerable<T> vs List<T> , Func<T> vs direct function calls).

The System.Linq.Enumerable.Sum approach ( Test_Sum_Delegate ) is identical to Test_MySum_IndirectFunc_IEnum (disregard the 4ms difference between them). Both of these approaches iterate over the List<T> using IEnumerable<T> .

Changing the method to pass the List<T> as a List<T> instead of IEnumerable<T> (in Test_MySum_IndirectFunc_List ) eliminates the virtual-calls from foreach using IEnumerator<T> which causes a reduction from ~114ms to 58ms , a 50% reduction in time already.

Then changing the Func<Int64,Int64> (a delegate) call to the GetValue func to a "static" call to GetValue (as in Test_MySum_DirectCall_List ) brings the time down to 37ms - which is the same as Test_ForEach_Delegate . This approach is the same as your hand-written foreach loop.

The only way to get faster performance is with a for loop without any virtual calls. (In debug builds the Unrolled loop is even faster than the normal for loop, but in Release builds there's no observed difference).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM