
Short circuit waiting on all tasks in any order

I need to make a series of calls to a web service to get a set of node counts. I am making all calls in parallel, asynchronously. After I initiate the calls, I sum the results like so:

//pendingTasks is List<Task<int>>
int sum = 0;
foreach (var task in pendingTasks)
{
    sum += await task;
    if (sum > 100) break;
}

The break is there because I don't care about the specific count after it exceeds 100.

First off, is it dangerous to break out of the loop like that? Is it bad to leave pending tasks? Will it create a memory leak of any kind?

Secondly, the response times of the individual calls are fairly inconsistent. I would hate it if the first call is the longest one and I end up waiting on it, even though all the subsequent ones sum to more than 100. It would be nice to add to the sum as individual results come back, in the order they are received. I have used WhenAll before, and I am pretty sure WhenAny is what I want, but I am not quite sure how to use it in this kind of scenario, where I want to process multiple results as they come in, and then terminate when they are all done.

First off, is it dangerous to break out of the loop like that? Is it bad to leave pending tasks? Will it create a memory leak of any kind?

It won't create a memory leak, but if the tasks have cancellation tokens associated with them, it would be useful to cancel them to avoid extra work being done unnecessarily.
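As a rough illustration of that point, the sketch below shares one CancellationTokenSource across all calls and cancels it once the threshold is passed. GetNodeCountAsync is a hypothetical stand-in for the web service call (it only simulates latency with Task.Delay), and the counts and delays are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class CancelRemainingExample
{
    // Hypothetical stand-in for the real web service call; it only simulates latency.
    public static async Task<int> GetNodeCountAsync(int count, int delayMs, CancellationToken token)
    {
        await Task.Delay(delayMs, token); // throws once the token is cancelled
        return count;
    }

    public static async Task<int> SumUntilOverAsync(int threshold)
    {
        using (var cts = new CancellationTokenSource())
        {
            // All calls start immediately and share the same token.
            var pendingTasks = new List<Task<int>>
            {
                GetNodeCountAsync(60, 10, cts.Token),
                GetNodeCountAsync(70, 20, cts.Token),
                GetNodeCountAsync(80, 5000, cts.Token),
            };

            int sum = 0;
            foreach (var task in pendingTasks)
            {
                sum += await task;
                if (sum > threshold) break;
            }

            // Ask any calls that are still running to stop.
            cts.Cancel();

            // Observe the abandoned tasks so their cancellation does not go unnoticed.
            try { await Task.WhenAll(pendingTasks); }
            catch (OperationCanceledException) { }

            return sum;
        }
    }
}
```

Cancellation only saves work if the underlying call actually honours the token; a call that ignores it will simply run to completion in the background.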

In .NET 4, a faulted task whose exception was never "observed" would (deliberately) bring down the process by default once the task was finalized, but this default has been relaxed in .NET 4.5.

I have used WhenAll before, and I am pretty sure WhenAny is what I want, but I am not quite sure how to use it in this kind of scenario, where I want to process multiple results as they come in, and then terminate when they are all done.

WhenAny certainly could work for you here, but another approach is to "magically" reorder the tasks so that you can just iterate over them in the order in which they complete. Or rather, iterate over new tasks which get the same result as the original tasks in the order in which they complete.

I wrote a blog post on that very topic a while ago - although it wasn't my idea. Basically you create a bunch of TaskCompletionSource objects - one for each original task - and add a continuation to each original task, to populate the "next available" task completion source.
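A minimal sketch of that reordering idea, written from the description above rather than copied from the blog post. The names TaskReordering, InCompletionOrder and SumUntilOverAsync are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TaskReordering
{
    // Returns new tasks that complete in the order in which the inputs finish.
    public static IReadOnlyList<Task<T>> InCompletionOrder<T>(IEnumerable<Task<T>> source)
    {
        var inputs = source.ToList();
        // One TaskCompletionSource per original task.
        var sources = inputs.Select(_ => new TaskCompletionSource<T>()).ToList();
        int nextIndex = -1;

        foreach (var input in inputs)
        {
            input.ContinueWith(completed =>
            {
                // Claim the "next available" completion source atomically.
                var tcs = sources[Interlocked.Increment(ref nextIndex)];
                if (completed.IsFaulted)
                    tcs.TrySetException(completed.Exception.InnerExceptions);
                else if (completed.IsCanceled)
                    tcs.TrySetCanceled();
                else
                    tcs.TrySetResult(completed.Result);
            },
            CancellationToken.None,
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
        }

        return sources.Select(s => s.Task).ToList();
    }

    // The loop from the question, unchanged apart from the reordering call.
    public static async Task<int> SumUntilOverAsync(IEnumerable<Task<int>> tasks, int threshold)
    {
        int sum = 0;
        foreach (var task in InCompletionOrder(tasks))
        {
            sum += await task;
            if (sum > threshold) break;
        }
        return sum;
    }
}
```

With this in place a slow first call no longer holds up the loop, because the first task in the reordered list is whichever original task finishes first.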

For an example of how you could use WhenAny, you could look at my majority voting blog post - but the downside of that is having quite a lot of collection manipulation, and n calls to WhenAny. The "magic reordering" creates one new collection of tasks, but then just attaches a continuation to each original task, so there's nothing which really waits for all of them... you can just iterate over them one at a time, waiting for each one in turn.
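For comparison, a WhenAny loop for the scenario in the question might look like the sketch below. The class and method names are invented for the example; it shows the collection manipulation and the repeated WhenAny calls mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WhenAnySum
{
    public static async Task<int> SumUntilOverAsync(IEnumerable<Task<int>> tasks, int threshold)
    {
        // Copy the tasks so finished ones can be removed as they complete.
        var remaining = new List<Task<int>>(tasks);
        int sum = 0;

        while (remaining.Count > 0 && sum <= threshold)
        {
            // WhenAny yields whichever remaining task finishes first.
            Task<int> finished = await Task.WhenAny(remaining);
            remaining.Remove(finished);

            // Awaiting the finished task unwraps its result (or rethrows its exception).
            sum += await finished;
        }

        return sum;
    }
}
```

Each iteration registers a continuation on every remaining task and then removes one item from the list, so the total work grows quadratically with the number of tasks. That rarely matters for a handful of web service calls.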

Jon Skeet's answer and the link to Stephen Toub's approach are both great.

An alternative is to use a BufferBlock<int> from the TPL Dataflow library. If you are able to change the parameters of the methods that create your tasks, you could simply pass in a BufferBlock and have each one Post its result:

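var buffer = new BufferBlock<int>();
var cancellationTokenSource = new CancellationTokenSource();

// Run your tasks somehow like so:
YourAsyncFunctionThatPostsAnInt(buffer, cancellationTokenSource.Token);
...

int sum = 0;
// Keep receiving until the total exceeds 100, matching the break in the question.
// Note: if the total never exceeds 100, this loop waits forever for another value.
while (sum <= 100)
{
    sum += await buffer.ReceiveAsync();
}

// Stop accepting further results and tell the remaining calls to stop.
buffer.Complete();
cancellationTokenSource.Cancel();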

Even with the existing Tasks, you could add continuations for them to Post their result to the buffer. A cancellation token is the best way to short circuit their execution.
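A sketch of the continuation variant, under a few assumptions: BufferedSum and SumUntilOverAsync are names invented here, and the completion counter is an addition not described above, included so the loop also ends when every task has finished without the total passing the threshold. Depending on the target framework, System.Threading.Tasks.Dataflow may need to be added as a NuGet package.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BufferedSum
{
    public static async Task<int> SumUntilOverAsync(IReadOnlyCollection<Task<int>> tasks, int threshold)
    {
        var buffer = new BufferBlock<int>();
        int remaining = tasks.Count;
        if (remaining == 0) buffer.Complete();

        foreach (var task in tasks)
        {
            task.ContinueWith(t =>
            {
                if (t.Status == TaskStatus.RanToCompletion)
                    buffer.Post(t.Result);          // successful results go into the buffer
                else
                    _ = t.Exception;                // observe failures; they add nothing to the sum

                // The last task to finish marks the buffer as complete.
                if (Interlocked.Decrement(ref remaining) == 0)
                    buffer.Complete();
            },
            CancellationToken.None,
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
        }

        int sum = 0;
        // OutputAvailableAsync returns false once the buffer is completed and drained.
        while (sum <= threshold && await buffer.OutputAvailableAsync())
        {
            sum += await buffer.ReceiveAsync();
        }

        // Results posted after this point are rejected by the block.
        buffer.Complete();
        return sum;
    }
}
```

If the tasks were created with a cancellation token, cancelling it after the loop would stop the remaining calls, as in the snippet above.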
