
Using Parallel.For for searching for minimum/maximum value

object _ = new object();
List<int> list = new List<int> { 1, 25, 3, /* .... */ 18, 255 }; // random values
int bestIndex = 0;
int bestValue = int.MinValue;
Parallel.For(0, list.Count, (i) => {
    if (list[i] > bestValue)
    {
        lock (_)
        {
            bestValue = list[i];
            bestIndex = i;
        }
    }
});

My question is: does this make sense? I suspect that in certain scenarios a lower value will be assigned even though it shouldn't be.

Your code is not thread-safe, since you read and write the same value from multiple threads. The comparison happens outside the lock, so bestValue may have changed by the time a thread acquires the lock, and a larger value can then be overwritten by a smaller one. If you fix this by putting the check inside the lock, you gain no parallelism and, worse, you take a highly contended lock in a very tight loop, which will most likely cause significant overhead. I would also recommend against using _ as a variable name, since it is used for discards as of C# 7.
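For reference, here is a minimal sketch of the "check inside the lock" variant mentioned above. It gives the right answer but serializes every iteration, so it only illustrates the problem. The sample data and the name gate are assumptions for illustration, and the snippet is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var list = new List<int> { 1, 25, 3, 18, 255 }; // sample data, assumed for illustration
var gate = new object();                         // hypothetical name for the lock object
int bestIndex = -1;
int bestValue = int.MinValue;

Parallel.For(0, list.Count, i =>
{
    // Both the comparison and the update happen under the lock,
    // so no thread can act on a stale bestValue. The price is that
    // every iteration has to take the same lock.
    lock (gate)
    {
        if (list[i] > bestValue)
        {
            bestValue = list[i];
            bestIndex = i;
        }
    }
});

Console.WriteLine($"{bestValue} at index {bestIndex}");
```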

The correct solution is to run multiple independent searches in parallel, and then do a final comparison of the results from each thread.

         var globalMin = int.MaxValue;
         var lockObj = new object();
         Parallel.ForEach(list,
             // LocalInit, runs once for each thread
             () => int.MaxValue, 
             // The parallel body, runs on multiple threads
             (value, _, localMin) =>
             {
                 if (value < localMin)
                 {
                     return value;
                 }
                 return localMin;
             }, 
             // Local finally, runs once for each thread,
             // given the final result produced on that thread
             localMin =>
             {
                 lock (lockObj)
                 {
                     if (localMin < globalMin)
                     {
                         globalMin = localMin;
                     }
                 }

             });
         return globalMin;
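The question actually looks for a maximum and its index. As a sketch only, the same thread-local pattern could be adapted like this, using the Parallel.For overload with local state and a value tuple. The sample data and the tuple element names Value and Index are assumptions for illustration. If the maximum occurs more than once, which of its indices is reported is not deterministic in this version.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var list = new List<int> { 1, 25, 3, 18, 255 }; // sample data, assumed for illustration
var lockObj = new object();
var best = (Value: int.MinValue, Index: -1);     // Index -1 means "nothing found yet"

Parallel.For(0, list.Count,
    // Local init: each task starts with its own empty result
    () => (Value: int.MinValue, Index: -1),
    // Body: compare against the task-local best only, no locking needed
    (i, _, local) =>
        local.Index < 0 || list[i] > local.Value
            ? (Value: list[i], Index: i)
            : local,
    // Local finally: merge each task's result into the shared one under a lock
    local =>
    {
        lock (lockObj)
        {
            if (local.Index >= 0 && (best.Index < 0 || local.Value > best.Value))
            {
                best = local;
            }
        }
    });

Console.WriteLine($"{best.Value} at index {best.Index}");
```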

This is, however, a bit long and complex. An alternative is to use PLINQ:

list.AsParallel().Min();

Edit: I wanted to add that using a parallel algorithm for such a simple task is unlikely to gain much performance, at least for primitive types. Running things in parallel is most useful when each unit of work is reasonably large, so that the synchronization overhead remains a small proportion of the overall work. You might partition the list manually so that each iteration does more work, which can improve performance a bit. But unless you have a really huge list, it is usually not worth the effort.
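One way to do the manual partitioning mentioned above is range partitioning with Partitioner.Create from System.Collections.Concurrent. This is a sketch rather than a tuned implementation: the list size and the seeded random data are assumptions for illustration, and whether it beats a plain sequential loop depends on the machine and the list size.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Sample data, assumed for illustration: one million seeded random ints
var rng = new Random(42);
var list = Enumerable.Range(0, 1_000_000).Select(_ => rng.Next()).ToList();

var globalMin = int.MaxValue;
var lockObj = new object();

// Partitioner.Create(0, n) yields index ranges as Tuple<int, int>
// (Item1 = from inclusive, Item2 = to exclusive), so each delegate
// call scans a whole chunk instead of a single element.
Parallel.ForEach(Partitioner.Create(0, list.Count), range =>
{
    var localMin = int.MaxValue;
    for (int i = range.Item1; i < range.Item2; i++)
    {
        if (list[i] < localMin)
        {
            localMin = list[i];
        }
    }

    // One lock acquisition per chunk rather than per element
    lock (lockObj)
    {
        if (localMin < globalMin)
        {
            globalMin = localMin;
        }
    }
});

Console.WriteLine(globalMin);
```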

Parallel.For wasn't built for such operations. That's the job of PLINQ. You can replace your code with just:

var min=list.AsParallel().Min();

This won't need any locks. It will partition the data into roughly as many partitions as there are cores, find the minimum in each partition, and then calculate the minimum across all partition minimums.

Because it does not synchronize on one global lock, this avoids the lock contention of the original code and should perform close to optimally.
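The partition-then-combine steps described above can be written out explicitly with the PLINQ Aggregate overload that takes a seed, a per-partition update function, and a combine function. This is only a sketch of the shape of the computation, not a claim about how Min() is implemented internally. The sample data is an assumption for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<int> { 1, 25, 3, 18, 255 }; // sample data, assumed for illustration

var min = list
    .AsParallel()
    .Aggregate(
        // Seed: the starting accumulator for each partition
        int.MaxValue,
        // Update: fold one element into its partition's running minimum
        (localMin, value) => Math.Min(localMin, value),
        // Combine: merge the minimums of two partitions
        (a, b) => Math.Min(a, b),
        // Result selector: return the final accumulator unchanged
        result => result);

Console.WriteLine(min);
```

For a plain minimum, list.AsParallel().Min() remains the shorter choice. The explicit form is mostly useful when the accumulator has to carry more than one value.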

On top of what JonasH said in their answer, I would like to add that using the Parallel class without specifying the MaxDegreeOfParallelism through the ParallelOptions is not a good idea. The default MaxDegreeOfParallelism is -1, which means unbounded parallelism, in practice bounded by the ThreadPool availability. In other words, calling any method of the Parallel class results (by default) in the ThreadPool becoming saturated for the whole duration of the parallel execution. A saturated ThreadPool is problematic when other parallel/asynchronous operations are happening concurrently. For example, the Elapsed event of the System.Timers.Timer class is invoked on the ThreadPool, so this event will fire erratically during the Parallel execution, because there will be no available ThreadPool thread to serve the event handler in a timely manner.

Here is how to configure the MaxDegreeOfParallelism of a Parallel loop, in order to mitigate the ThreadPool-starvation problem:

var parallelOptions = new ParallelOptions()
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
};
Parallel.ForEach(list, parallelOptions, // ... It's OK now

In the above example, the Parallel.ForEach loop will use at most Environment.ProcessorCount - 1 thread-pool threads concurrently, plus the current thread.
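Put together with the thread-local minimum pattern from the earlier answer, a complete call might look like the following sketch. The sample data is an assumption for illustration, and the loop body is just one possible way to fill in the elided part of the call above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var list = new List<int> { 1, 25, 3, 18, 255 }; // sample data, assumed for illustration

var parallelOptions = new ParallelOptions
{
    // Cap the loop at one worker per logical processor
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

var globalMin = int.MaxValue;
var lockObj = new object();

Parallel.ForEach(list, parallelOptions,
    // Local init: each task starts from int.MaxValue
    () => int.MaxValue,
    // Body: update the task-local minimum without locking
    (value, _, localMin) => Math.Min(value, localMin),
    // Local finally: merge the task-local minimum into the shared result
    localMin =>
    {
        lock (lockObj)
        {
            if (localMin < globalMin)
            {
                globalMin = localMin;
            }
        }
    });

Console.WriteLine(globalMin);
```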

This advice does not apply to PLINQ, which has a default max degree of parallelism equal to the number of logical processors of the machine.


 