Running Multiple Tasks in Parallel

I have a list of proxies. Each proxy goes to various sites and pulls the needed data from them. Currently this happens one site at a time, but I'd like to have 10 to 20 tasks running at once, so that it's downloading from 20 sites in one go rather than just one.

Here's how I'm currently doing it:

private async Task<string> DownloadDataFromSite(string url)
{
    // (await) Do Work.
    return HTMLSourceCode;
}

I then loop through the proxies:

foreach(Proxy p in proxies)
{
    string source = await DownloadDataFromSite(site);
}

Is Parallel.ForEach suitable for such a task? I've tried it, but the problem I'm having at the moment is that I can't use await inside it.

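One way is to avoid awaiting inside the foreach. Awaiting each call in the loop does not block a thread, but it does make the loop sequential: the next download cannot start until the previous one has finished. A better way might be something like this: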

await Task.WhenAll(proxies.Select(p => DownloadDataFromSite(site)));

This starts all the tasks first and then awaits them together, which means the asynchronous I/O is going to happen concurrently. Note that if you're doing CPU-bound work too, that part is not really going to be parallelized this way.
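To make the one-liner above concrete, here is a minimal sketch that also collects the results. `FakeDownloadAsync` is a hypothetical stand-in for `DownloadDataFromSite` (it uses `Task.Delay` in place of real network I/O), and the class and parameter names are assumptions made for illustration, not part of the original code.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical stand-in for DownloadDataFromSite: Task.Delay simulates network I/O.
    public static async Task<string> FakeDownloadAsync(string url)
    {
        await Task.Delay(100);
        return "<html>" + url + "</html>";
    }

    public static async Task<string[]> DownloadAllAsync(IEnumerable<string> urls)
    {
        // Calling the async method starts each download; ToList forces
        // every task to be created now instead of lazily.
        List<Task<string>> tasks = urls.Select(url => FakeDownloadAsync(url)).ToList();

        // WhenAll completes once every task has completed. The result array
        // is in the same order as the input tasks.
        return await Task.WhenAll(tasks);
    }
}
```

Calling `await WhenAllSketch.DownloadAllAsync(new[] { "a", "b", "c" })` should take roughly as long as one simulated download rather than three, since all the delays overlap.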

The point is that asynchronous I/O (such as downloading a web page) doesn't require more threads in order to run concurrently. Parallel.ForEach, on the other hand, is intended for CPU-bound work rather than I/O-bound work, and it does execute the code on multiple threads.
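Since the question asks for 10 to 20 downloads at a time rather than all of them at once, one common way to cap concurrency without extra threads is a `SemaphoreSlim` used as an asynchronous gate. The sketch below is one possible approach, not something the answers above prescribe; `RunThrottledAsync`, `download`, and `maxConcurrency` are illustrative names, and `download` stands in for a method like `DownloadDataFromSite`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledDownloads
{
    // Runs download(url) for every url, with at most maxConcurrency in flight.
    public static async Task<string[]> RunThrottledAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> download,
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            List<Task<string>> tasks = urls.Select(async url =>
            {
                // Asynchronously wait for a free slot; no thread is blocked here.
                await gate.WaitAsync();
                try
                {
                    return await download(url);
                }
                finally
                {
                    // Free the slot even if the download throws.
                    gate.Release();
                }
            }).ToList();

            // Awaited inside the using block, so the semaphore is only
            // disposed after every task has finished with it.
            return await Task.WhenAll(tasks);
        }
    }
}
```

With the question's method this might be called as `await ThrottledDownloads.RunThrottledAsync(urls, DownloadDataFromSite, 20)`, assuming `urls` holds the sites to fetch.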

Parallel.ForEach does not work well here because it expects a synchronous lambda. If you give it an asynchronous one, each invocation returns to Parallel.ForEach at its first incomplete await, so the loop finishes before the downloads have completed. There is a way around it though; check this question out: Is it OK to do some async/await inside some .NET Parallel.ForEach() code?
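As a separate option that is not necessarily what the linked question recommends: on .NET 6 and later, `Parallel.ForEachAsync` accepts an asynchronous body and waits for it to finish. The sketch below assumes .NET 6+ is available; `DownloadAllAsync`, `download`, and `maxConcurrency` are illustrative names, and the code assumes the URLs are distinct because they are used as dictionary keys.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncSketch
{
    // Requires .NET 6 or later. Returns a map from url to downloaded content.
    public static async Task<IDictionary<string, string>> DownloadAllAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> download,
        int maxConcurrency)
    {
        // Thread-safe collection, because bodies may complete on different threads.
        var results = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrency };

        // Unlike Parallel.ForEach, the returned task completes only after
        // every asynchronous body has run to completion.
        await Parallel.ForEachAsync(urls, options, async (url, cancellationToken) =>
        {
            results[url] = await download(url);
        });

        return results;
    }
}
```

Unlike `Task.WhenAll`, this does not preserve input order, which is why the results are keyed by URL here.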
