Question: Why does the execution time of Sum()
take much longer than a foreach()
in the following scenario?
public void TestMethod4()
{
List<int> numbers = new List<int>();
for (int i = 0; i < 1000000000; i++)
{
numbers.Add(i);
}
Stopwatch sw = Stopwatch.StartNew();
long totalCount = numbers.Sum(num => true ? 1 : 0); // simulating a dummy true condition
sw.Stop();
Console.WriteLine("Time taken Sum() : {0}ms", sw.Elapsed.TotalMilliseconds);
sw = Stopwatch.StartNew();
totalCount = 0;
foreach (var num in numbers)
{
totalCount += true ? 1 : 0; // simulating a dummy true condition
}
sw.Stop();
Console.WriteLine("Time taken foreach() : {0}ms", sw.Elapsed.TotalMilliseconds);
}
Sample run1
Time taken Sum() : 21443.8093ms
Time taken foreach() : 4251.9795ms
TL;DR : The difference in times is caused by the CLR applying two separate optimizations in the second case, but not the first case:
Sum
operates on IEnumerable<T>
, not List<T>
.
foreach
with List<T>
, but not if a List<T>
is passed as IEnumerable<T>
.
IEnumerator<T>
and incurring the cost of all of virtual-calls associated with that.List<T>
directly uses static calls (instance method calls are still "static" calls, provided they're not virtual).Sum
accepts a delegate Func<T,Int64>
.
Func<T,Int64>
are not inlined, even with MethodImplOptions.AggressiveInline
. I've reimplemented your SUM
program using a variety of different approaches which you access here:https://gist.github.com/Jehoel/1a4fcd2e70374d3694c3a105061a6d1c
My benchmark results (Release build, x64, .NET Core 5, i7-7700HQ):
Approach | Time (ms) |
---|---|
Test_Sum_Delegate | 118ms |
Test_MySum_DirectFunc_IEnum | 112ms |
Test_MySum_IndirectFunc_IEnum | 114ms |
Test_MySum_DirectCall_IEnum | 89ms |
Test_MySum_DirectFunc_List | 58ms |
Test_MySum_IndirectFunc_List | 58ms |
Test_MySum_DirectCall_List | 37ms |
Test_Sum_DelegateLambda | 109ms |
Test_For_Inline | 4ms |
Test_For_Delegate | 3ms |
Test_ForUnrolled_Inline | 4ms |
Test_ForUnrolled_Delegate | 4ms |
Test_ForEach_Inline | 38ms |
Test_ForEach_Delegate | 37ms |
We can isolate the different behaviour by changing one thing at a time (eg foreach
vs for
, IEnumerable<T>
vs List<T>
, Func<T>
vs direct function calls).
The System.Linq.Enumerable.Sum
approach ( Test_Sum_Delegate
) is identical to Test_MySum_IndirectFunc_IEnum
(disregard the 4ms
difference between them). Both of these approaches iterate over the List<T>
using IEnumerable<T>
.
Changing the method to pass the List<T>
as a List<T>
instead of IEnumerable<T>
(in Test_MySum_IndirectFunc_List
) eliminates the virtual-calls from foreach
using IEnumerator<T>
which causes a reduction from ~114ms
to 58ms
, a 50% reduction in time already.
Then changing the Func<Int64,Int64>
(a delegate) call to the GetValue
func to a "static" call to GetValue
(as in Test_MySum_DirectCall_List
) brings the time down to 37ms
- which is the same as Test_ForEach_Delegate
. This approach is the same as your hand-written foreach
loop.
The only way to get faster performance is with a for
loop without any virtual calls. (In debug builds the Unrolled loop is even faster than the normal for
loop, but in Release builds there's no observed difference).
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.