My task is to write a known number of values to an external system through an asynchronous interface. I have to limit the number of writes that run concurrently. I also need load balancing, because the external system may take longer to write some values than others.
I know how to solve each of these problems on its own:
Degree of parallelism:
new ParallelOptions {MaxDegreeOfParallelism = maxNrParallelWrites}
I also stumbled over this article: http://msdn.microsoft.com/en-us/library/ee789351(v=vs.110).aspx
Load balancing:
var partitioner = Partitioner.Create(values.ToList(), true);
Task from async interface:
var writeTask = Task<AccessResult>.Factory.FromAsync(BeginWriteValue, EndWriteValue, value.SystemId, value.Xml, priority, null);
But how do I correctly combine all these techniques? I created the following code:
int maxNrParallelWrites = GetMaxNrParallelWrites();
var partitioner = Partitioner.Create(values.ToList(), true);
Parallel.ForEach(partitioner, new ParallelOptions {MaxDegreeOfParallelism = maxNrParallelWrites},
    (val) =>
    {
        var writeValueTask = GetWriteValueTask(val, priority);
        Task.WaitAny(writeValueTask);
    });
I'm especially unsure about the last part of the previous code: the action that executes the workload. Instead of creating a WriteValueTask, would it be better to use the synchronous interface directly, like this:
(val) =>
{
var accessResult = externalSystem.WriteValue(....);
}
Or is it okay to create a task and then directly wait for it (Task.WaitAny(...))?
You should use TPL Dataflow's ActionBlock, which encapsulates all of that for you. TPL Dataflow is an actor-based framework built on top of the TPL:
var block = new ActionBlock<Value>(
    value => GetWriteValueTask(value, priority),
    new ExecutionDataflowBlockOptions()
    {
        MaxDegreeOfParallelism = GetMaxNrParallelWrites()
    });
foreach (var value in values)
{
    block.Post(value);
}
You can set MaxDegreeOfParallelism and BoundedCapacity, and load balancing is built in: the block handles at most MaxDegreeOfParallelism items at a time and starts the next item as soon as one completes (as opposed to using a Partitioner that partitions the collection in advance).
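To make the snippet above easier to try out, here is a minimal, self-contained sketch. It is not the original poster's code: WriteValueAsync is a hypothetical stand-in for GetWriteValueTask that only simulates a write with Task.Delay, and the peak counter exists purely to show that the limit is respected. Depending on the target framework, the System.Threading.Tasks.Dataflow NuGet package may be required.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // may need the System.Threading.Tasks.Dataflow NuGet package

public static class ThrottledWriter
{
    // Hypothetical stand-in for GetWriteValueTask: simulates a write of varying duration.
    private static Task WriteValueAsync(int value)
    {
        return Task.Delay(10 + (value % 3) * 10);
    }

    // Writes all values with at most maxParallel writes in flight.
    // Returns the highest number of concurrent writes that was observed.
    public static async Task<int> WriteAllAsync(IEnumerable<int> values, int maxParallel)
    {
        var gate = new object();
        int inFlight = 0;
        int peak = 0;

        var block = new ActionBlock<int>(
            async value =>
            {
                lock (gate)
                {
                    inFlight++;
                    peak = Math.Max(peak, inFlight);
                }
                try
                {
                    await WriteValueAsync(value);
                }
                finally
                {
                    lock (gate) { inFlight--; }
                }
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        foreach (var value in values)
        {
            block.Post(value);
        }

        block.Complete();        // signal that no more items will be posted
        await block.Completion;  // completes once every posted item has been handled
        return peak;
    }
}
```

Calling Complete() and awaiting Completion is what lets the caller know that all writes have finished; without it, the loop that posts the values returns immediately while the writes are still running.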
Note: When you take an async task and wait for it to complete synchronously (i.e. with Task.WaitAny), nothing is actually asynchronous, because the calling thread is blocked until the task finishes. You should await Task.WhenAny instead in such cases.
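As an illustration of the note above, the following sketch throttles asynchronous work by awaiting Task.WhenAny instead of blocking on Task.WaitAny. The class and method names (WhenAnyThrottle, RunAsync) are made up for this example, and body stands in for whatever creates the write task. Treat it as one possible approach, not a definitive implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WhenAnyThrottle
{
    // Starts body(item) for every item, keeping at most maxParallel tasks in flight.
    // No thread is blocked while waiting; the method yields at each await.
    public static async Task RunAsync<T>(IEnumerable<T> items, int maxParallel, Func<T, Task> body)
    {
        var inFlight = new List<Task>();

        foreach (var item in items)
        {
            if (inFlight.Count >= maxParallel)
            {
                // Asynchronously wait until any running task finishes, then free its slot.
                Task finished = await Task.WhenAny(inFlight);
                inFlight.Remove(finished);
                await finished; // rethrows if the finished task faulted
            }

            inFlight.Add(body(item));
        }

        // Wait for the remaining tasks without blocking a thread.
        await Task.WhenAll(inFlight);
    }
}
```

Because a new task is started as soon as any running one completes, slow items do not hold up the rest, which gives the same load-balancing effect described for ActionBlock.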
There is a good example of how to create a load-balancing ForEachAsync method in this article. I've taken out the Task.Run to avoid pushing the work onto a separate thread-pool thread, and the extension method then becomes this:
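using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class Extensions
{
    // Drains one partition sequentially, awaiting each item before taking the next.
    public static async Task ExecuteInPartition<T>(IEnumerator<T> partition, Func<T, Task> body)
    {
        using (partition)
            while (partition.MoveNext())
                await body(partition.Current);
    }

    // Runs body over source using dop partitions, so at most dop items are in progress at once.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select ExecuteInPartition(partition, body));
    }
}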
Usage
This example asynchronously processes a maximum of 100 emails at a time:
// Process 100 emails at a time
return emailsToProcess.ForEachAsync(100, ProcessSingleEmail);