I got a large list to loop through (1.500.000 items), with each item I have to do a very small check. Totally during 30 seconds.
The CPU utilization when using Sequential is around 10%, so there's a lot of resources not used.
The first thought was to use Parallel, but due to the limited time duration for each item, Parallel lasts longer than a sequential Foreach, this is due to " Why was the parallel version slower than the sequential version in this example? ", which explains that the creation of each task will cost time.
So I had another thought and that is to divide the list in 4 (or more) equal peaces and create a thread to loop through the items to get it faster.
Before creating my own class, is this a good approach? Or any other thoughts of how to speed things up? Or do you know a better way of handling this.
The code I created for another parallel approach: (used in my own static class)
public static void ForEach<T>(IEnumerable<T> list, Action<T> body, int listDevide)
{
// Number of items
int items = list.Count();
// Divided (in int, so floored)
int listPart = items / listDevide;
// Get numbers extra for last run
int rest = items % listDevide;
// List to save the actions
var actions = new List<Action>();
for(var x = 0; x < listDevide; x++)
{
// Create the actions
actions.Add(delegate {
foreach(var item in list.Skip(x * listPart).Take(listPart))
{
body.Invoke(item);
}
});
}
// Run the actions parallel
Parallel.Invoke(actions.ToArray());
}
Remark: "rest" variable to do the last items is currently not used in this example.
Solution below, more info: http://msdn.microsoft.com/en-us/library/dd997411.aspx
Yes, partitioning the input array is a good approach.
In fact, Microsoft provide a Partitioner
class to help with exactly this approach.
Here's an example showing how to do it:
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;
namespace Demo
{
class Program
{
private void run()
{
double sum = 0;
Func<double, double> func = x => Math.Sqrt(Math.Sin(x));
object locker = new object();
double[] data = testData();
// For each double in data[] we are going to calculate Math.Sqrt(Math.Sin(x)) and
// add all the results together.
//
// To do this, we use class Partitioner to split the input array into just a few partitions,
// (the Partitioner will use knowledge about the number of processor cores to optimize this)
// and then add up all the values using a separate thread for each partition.
//
// We use threadLocalState to compute the total for each partition, and then we have to
// add all these together to get the final sum. We must lock the additon because it isn't
// threadsafe, and several threads could be doing it at the same time.
Parallel.ForEach
(
Partitioner.Create(0, data.Length),
() => 0.0,
(subRange, loopState, threadLocalState) =>
{
for (int i = subRange.Item1; i < subRange.Item2; i++)
{
threadLocalState += func(data[i]);
}
return threadLocalState;
},
finalThreadLocalState =>
{
lock (locker)
{
sum += finalThreadLocalState;
}
}
);
Console.WriteLine("Sum = " + sum);
}
private static double[] testData()
{
double[] array = new double[1000003]; // Test with an odd number of values.
Random rng = new Random(12345);
for (int i = 0; i < array.Length; ++i)
array[i] = rng.Next() & 3; // Don't want large values for this simple test.
return array;
}
static void Main()
{
new Program().run();
}
}
}
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.