
What is the best way to loop over an async method?

I'm wondering what the best way is to loop over an async method. Let's say I have a method:

public async Task<bool> DownloadThenWriteThenReturnResult(string id)
{
    // async/await stuff....
}

I want to call this method 10 000 times, assuming I already have a list of 10 000 strings called "_myStrings" to use as parameters. I want at most 4 threads to share this work (in production I'd use ProcessorCount - 1). I want to be able to cancel everything. And finally, I want the result of each call. I'd like to know what the differences are between the following approaches, which one is best, and why:

*1 -

var allTasks = _myStrings.Select(st => DownloadThenWriteThenReturnResult(st));
bool[] syncSuccs = await Task.WhenAll(allTasks);

*2 -

await Task.Run(() =>
{
    var result = new ConcurrentQueue<bool>();
    var po = new ParallelOptions(){MaxDegreeOfParallelism = 4};
    Parallel.ForEach(_myStrings, po, (st) =>
    {
        result.Enqueue(DownloadThenWriteThenReturnResult(st).Result);
        po.CancellationToken.ThrowIfCancellationRequested();
    });
});

*3 -

using (SemaphoreSlim throttler = new SemaphoreSlim(initialCount: 4))
{
    var results = new List<bool>();
    var allTasks = new List<Task>();
    foreach (var st in _myStrings)
    {
        await throttler.WaitAsync();
        allTasks.Add(Task.Run(async () =>
        {
            try
            {
                results.Add(await DownloadThenWriteThenReturnResult(st));
            }
            finally
            {
                throttler.Release();
            }
        }));
    }
    await Task.WhenAll(allTasks);
}

*4 -

var block = new TransformBlock<string, bool>(
async st =>
{
    return await DownloadThenWriteThenReturnResult(st);
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4});

foreach (var st in _myStrings)
{
    await block.SendAsync(st);
}

var results = new List<bool>();
foreach (var st in _myStrings)
{
    results.Add(await block.ReceiveAsync());
}

Is there another way? These 4 gave me similar results, although only *2, *3 and *4 use 4 threads. Can you confirm that:

  • *1 creates 10 000 tasks on the thread pool, but they will all be executed on a single thread

  • *2 will create 4 threads T1, T2, T3 and T4. It uses .Result, so it is not async all the way (should I avoid that here?). Since DownloadThenWriteThenReturnResult is executed on one of the 4 threads T1, T2, T3 or T4, where are the nested tasks placed (by nested tasks I mean what every async method returns when awaited)? On dedicated thread pool threads (let's say T11, T21, T31 and T41)?

  • Same question for *3 and *4

*4 seems to be my best shot. It's easy to understand what's going on, and I'll be able to create new blocks and link them if needed. It also seems completely async. But I'd like to understand where the nested tasks from all my async/await code within DownloadThenWriteThenReturnResult are executed, and whether this is the best way to do it.

Thanks for any hints!

I will try to answer all your questions.

My proposal

First, this is what I would do. I tried to minimize the number of tasks and to keep the code simple.

Your problem looks like a producer/consumer case. I would go with something simple like this:

public async Task Work(ConcurrentQueue<string> input, ConcurrentQueue<bool> output)
{
    string current;
    while (input.TryDequeue(out current))
    {
        output.Enqueue(await DownloadThenWriteThenReturnResult(current));
    }
}

var nbThread = 4;
var input = new ConcurrentQueue<string>(_myStrings);
var output = new ConcurrentQueue<bool>();

var workers = new List<Task>(nbThread);

for (int i = 0; i < nbThread; i++)
{
    workers.Add(Task.Run(async () => await this.Work(input, output)));
}

await Task.WhenAll(workers);
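
The question also asks for a way to cancel everything, which the loop above does not cover. Here is a rough sketch of one way to add it, not a definitive implementation: each worker checks a CancellationToken between items. The names CancellableWorkers and RunAsync and the download delegate (standing in for DownloadThenWriteThenReturnResult) are made up for illustration. An operation that is already in flight is not interrupted here; for that, the token would also have to be passed into the download method itself.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class CancellableWorkers
{
    // Hypothetical variant of Work: stops taking new items once the token is cancelled.
    public static async Task Work(
        ConcurrentQueue<string> input,
        ConcurrentQueue<bool> output,
        Func<string, Task<bool>> download,
        CancellationToken token)
    {
        while (!token.IsCancellationRequested && input.TryDequeue(out var current))
        {
            output.Enqueue(await download(current));
        }

        // Report a cancellation to the caller as an OperationCanceledException.
        token.ThrowIfCancellationRequested();
    }

    // The caller owns `output`, so results gathered before a cancellation stay available.
    public static async Task RunAsync(
        IEnumerable<string> ids,
        ConcurrentQueue<bool> output,
        int workerCount,
        Func<string, Task<bool>> download,
        CancellationToken token)
    {
        var input = new ConcurrentQueue<string>(ids);
        var workers = new List<Task>(workerCount);

        for (int i = 0; i < workerCount; i++)
        {
            workers.Add(Task.Run(() => Work(input, output, download, token)));
        }

        await Task.WhenAll(workers);
    }
}
```

The caller would create a CancellationTokenSource, pass its Token to RunAsync, and catch OperationCanceledException around the await.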

I am not sure the number of threads should be tied to the number of processors. That would be true if you were dealing with CPU-bound operations. In such cases, you should run as synchronously as possible, because the overhead the system introduces when switching from one context to another is heavy. So in that case, one operation per thread is the way to go.

But in your case, since you are waiting for I/O most of the time (network for the HTTP call, disk for the write, etc.), you could probably start more tasks in parallel. Each time a task is waiting for I/O, the system can pause it and switch to another task. The overhead here is not wasted, because otherwise the thread would just be waiting and doing nothing.

You should benchmark with 4, 5, 6, etc. tasks and find which number is the most efficient.

One issue I can see here is that you don't know which input produced which output. You could use a ConcurrentDictionary instead of two ConcurrentQueues, but then there can't be duplicates in _myStrings.
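
A minimal sketch of that dictionary variant, assuming every string in _myStrings is unique. KeyedWorkers, RunAsync and the download delegate (standing in for DownloadThenWriteThenReturnResult) are illustrative names, not part of the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class KeyedWorkers
{
    // Hypothetical helper: drains `ids` with `workerCount` workers and stores
    // each result under the id that produced it.
    public static async Task<ConcurrentDictionary<string, bool>> RunAsync(
        IEnumerable<string> ids,
        int workerCount,
        Func<string, Task<bool>> download)
    {
        var input = new ConcurrentQueue<string>(ids);
        var output = new ConcurrentDictionary<string, bool>();

        var workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Run(async () =>
            {
                while (input.TryDequeue(out var id))
                {
                    // Assumption: ids are unique. A duplicate id would silently
                    // overwrite the earlier result.
                    output[id] = await download(id);
                }
            }))
            .ToList();

        await Task.WhenAll(workers);
        return output;
    }
}
```

After the call, the result for a given id can be read back with TryGetValue(id, out var ok).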

Your solutions

Here is what I think about your solutions.

Solution *1

As you said, it is going to create 10 000 tasks. As far as I know (but I am not an expert in that field), the system will share the ThreadPool threads among the tasks in some round-robin fashion. I think the same task can even start executing on one thread, be paused by the system, and finish on another thread. This will introduce more overhead than necessary and make the overall run time slower.

I think this must absolutely be avoided!
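
One more way to look at *1 is to count how many calls are in flight at the same time. The sketch below is only an illustration: FakeDownload is a made-up stand-in for DownloadThenWriteThenReturnResult that uses Task.Delay to simulate the I/O wait, and UnthrottledDemo is a hypothetical name. Because Task.WhenAll enumerates the Select, and each call runs synchronously up to its first await, nothing limits the number of operations started.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class UnthrottledDemo
{
    private static int _inFlight;
    private static int _maxInFlight;

    // Hypothetical stand-in for the real method; Task.Delay simulates I/O.
    private static async Task<bool> FakeDownload(string id)
    {
        int now = Interlocked.Increment(ref _inFlight);

        // Record the highest number of operations seen in flight at once.
        int seen;
        while (now > (seen = Volatile.Read(ref _maxInFlight)))
        {
            Interlocked.CompareExchange(ref _maxInFlight, now, seen);
        }

        await Task.Delay(200);
        Interlocked.Decrement(ref _inFlight);
        return true;
    }

    public static async Task<int> MaxInFlightAsync(int count)
    {
        _inFlight = 0;
        _maxInFlight = 0;

        var ids = Enumerable.Range(0, count).Select(i => i.ToString());

        // Same shape as *1: every call is started before any of them is awaited.
        await Task.WhenAll(ids.Select(id => FakeDownload(id)));

        return _maxInFlight;
    }
}
```

With the simulated delay, the reported maximum should land well above 4, typically close to count itself.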

Solution *2

I have read that the Parallel API does not work well with asynchronous operations. I have also read plenty of times that you should not call .Result on a task unless absolutely necessary.

So I would avoid this solution too.

Solution *3

Honestly, I can't picture exactly what this will do ^^. It may be a good solution, since you are not creating all the tasks at once. Still, you are going to create 10 000 tasks overall, so I would avoid it.

Solution *4

Honestly, I didn't even know about this API, so I cannot really comment on it. But since it involves an additional library (TPL Dataflow ships as the separate System.Threading.Tasks.Dataflow package), I would avoid it if possible.
