
How to use the Task Parallel Library (TPL) with load balancing and a limited degree of parallelism?

My task is to write a known number of values to an external system through an (async) interface. I have to limit the maximum number of writes that are executed concurrently. Additionally, I have to use load balancing because the external system may take longer to write some values than others.

I know how to solve each of these problems on its own:

Degree of parallelism:

new ParallelOptions {MaxDegreeOfParallelism = maxNrParallelWrites}

I also stumbled over this article: http://msdn.microsoft.com/en-us/library/ee789351(v=vs.110).aspx

Load balancing:

var partitioner = Partitioner.Create(values.ToList(), true);

Task from async interface:

var writeTask = Task<AccessResult>.Factory.FromAsync(BeginWriteValue, EndWriteValue, value.SystemId, value.Xml, priority, null);



But how do I correctly combine all these techniques? I created the following code:

  int maxNrParallelWrites = GetMaxNrParallelWrites();
  var partitioner = Partitioner.Create(values.ToList(), true);
  Parallel.ForEach(partitioner, new ParallelOptions {MaxDegreeOfParallelism = maxNrParallelWrites},
    (val) =>
    {
      var writeValueTask = GetWriteValueTask(val, priority);
      Task.WaitAny(writeValueTask);
    });

I'm especially unsure about the last part of the previous code: the action that executes the workload. Instead of creating a WriteValueTask, would it be better to use the synchronous interface directly, like this:

(val) =>
    {
      var accessResult = externalSystem.WriteValue(....);
    }

Or is it okay to create a task and then directly wait for it (Task.WaitAny(...))?

You should use TPL Dataflow's ActionBlock, which encapsulates all of that for you. It's an actor-based framework that is part of the TPL:

var block = new ActionBlock<Value>(
    value => GetWriteValueTask(value, priority),
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = GetMaxNrParallelWrites()
    });

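foreach (var value in values)
{
    block.Post(value);
}

// Signal that no more values will be posted, then wait for the pending writes to finish.
block.Complete();
await block.Completion;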

You can set MaxDegreeOfParallelism and BoundedCapacity through the options object. Load balancing is baked in: the block handles at most MaxDegreeOfParallelism items at a time, and as soon as one completes it picks up the next one (as opposed to using a Partitioner that partitions the collection in advance).
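To make the effect of those two options concrete, here is a minimal sketch rather than a definitive implementation. The names BoundedWriter and WriteAllAsync are hypothetical, and Task.Delay is only a stand-in for the real external write. It assumes the System.Threading.Tasks.Dataflow library is available (on .NET Framework it comes from the NuGet package of the same name). Because BoundedCapacity is set, the sketch uses SendAsync, which waits for free capacity, whereas Post would simply return false when the block is full.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedWriter
{
    // Pushes every value through an ActionBlock and returns the highest
    // number of writes that were observed running at the same time.
    public static async Task<int> WriteAllAsync(int[] values, int maxParallelWrites)
    {
        var gate = new object();
        int running = 0;
        int peak = 0;

        var block = new ActionBlock<int>(
            async value =>
            {
                lock (gate) { running++; peak = Math.Max(peak, running); }

                // Assumption: stands in for the external write; some values take longer.
                await Task.Delay(10 + (value % 3) * 10);

                lock (gate) { running--; }
            },
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxParallelWrites,
                // Limits queued plus in-flight items, so the producer is throttled.
                BoundedCapacity = maxParallelWrites * 2
            });

        foreach (int value in values)
        {
            // Waits asynchronously until the block has room for another item.
            await block.SendAsync(value);
        }

        block.Complete();        // no more input
        await block.Completion;  // wait for the remaining writes
        return peak;
    }
}
```

Calling await BoundedWriter.WriteAllAsync(values, 4) should never report a peak above 4, and slow items do not hold up the others because each free slot takes the next queued value.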

Note: When you take an async task and wait for it to complete synchronously (i.e. Task.WaitAny), nothing is actually asynchronous: the calling thread blocks until the task finishes. You should await the task instead, for example with await Task.WhenAny(...).
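The difference can be sketched as follows. This is only an illustration: AwaitVersusWait and WriteAsync are hypothetical names, and WriteAsync stands in for the task returned by GetWriteValueTask. Both methods produce the same result, but the first one ties up the calling thread while the second one releases it until the write completes.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitVersusWait
{
    // Assumption: a stand-in for the asynchronous write to the external system.
    public static async Task<int> WriteAsync(int value)
    {
        await Task.Delay(50);
        return value * 2;
    }

    // Blocks the calling thread until the task has finished.
    public static int WriteBlocking(int value)
    {
        Task<int> task = WriteAsync(value);
        Task.WaitAny(task);
        return task.Result;
    }

    // Frees the calling thread while the write is in flight.
    public static async Task<int> WriteAwaiting(int value)
    {
        Task<int> task = WriteAsync(value);
        Task<int> finished = await Task.WhenAny(task); // with a single task this is the same as awaiting it
        return await finished;
    }
}
```

With only one task, plain await task is the simpler form; Task.WhenAny becomes useful once several tasks are in flight and you want to react to whichever finishes first.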

There is a good example of how to create a load-balancing ForEachAsync method in this article. I've taken out the Task.Run to avoid starting a new thread, and then the extension method becomes this:

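using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class Extensions
{
    // Drains one partition sequentially, awaiting each item before taking the next.
    public static async Task ExecuteInPartition<T>(IEnumerator<T> partition, Func<T, Task> body)
    {
        using (partition)
            while (partition.MoveNext())
                await body(partition.Current);
    }

    // Creates dop partitions over the source and processes them concurrently,
    // so at most dop bodies are in flight at once.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select ExecuteInPartition(partition, body));
    }
}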

Usage

This example asynchronously processes a maximum of 100 emails at a time

 // Process 100 emails at a time
 return emailsToProcess.ForEachAsync(100, ProcessSingleEmail);
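If you want to convince yourself that the limit holds, a rough check along these lines may help. It is a sketch under assumptions: ForEachAsyncDemo and MeasurePeakAsync are hypothetical names, the two helper methods repeat the extension shown above so the snippet compiles on its own, and Task.Delay stands in for the real per-item work.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ForEachAsyncDemo
{
    // Same logic as ExecuteInPartition in the answer above.
    private static async Task ExecuteInPartition<T>(IEnumerator<T> partition, Func<T, Task> body)
    {
        using (partition)
            while (partition.MoveNext())
                await body(partition.Current);
    }

    // Same logic as ForEachAsync in the answer above, written as a plain static method.
    public static Task ForEachAsync<T>(IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select ExecuteInPartition(partition, body));
    }

    // Processes itemCount dummy items and returns the highest number of
    // bodies that were observed running at the same time.
    public static async Task<int> MeasurePeakAsync(int itemCount, int dop)
    {
        var gate = new object();
        int running = 0;
        int peak = 0;

        await ForEachAsync(Enumerable.Range(0, itemCount), dop, async item =>
        {
            lock (gate) { running++; peak = Math.Max(peak, running); }

            await Task.Delay(10); // assumption: stands in for processing one item

            lock (gate) { running--; }
        });

        return peak;
    }
}
```

The value returned by MeasurePeakAsync(50, 5) should never exceed 5, because each of the five partitions awaits its current item before moving on to the next one.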

