简体   繁体   中英

How to get C# async methods to actually run in parallel?

I have never used async methods before, but I found myself building a monitoring tool and thought that I could use this type of methods to speed things up.

The tool defines a list of SiteConfigurations , and each SiteConfiguration has a list of Monitors to be tested (via HTTP requests). The architecture of the tool is pipeline based, so for example:

  • when a monitor succeeds, the MonitorSuccessPipeline is executed
  • when a monitor fails, the MonitorFailurePipeline is executed
  • when all monitors within a site configuration have run, the SiteAfterMonitorsPipeline is executed
  • when all site configurations have run, the EndEnginePipeline is executed

The initial approach was to loop through all SiteConfigurations and, for each of them, run all associated Monitors .

Total execution time: 100 SiteConfigurations x 4 Monitors x ~1 second per monitor = ~400 seconds to run.

So I changed my RunMonitor to be async (the actual method does more than what you see here, this is the "core" of the method):

    private async Task<Result> RunMonitor(Monitor currentMonitor)
    {
        Result result = new Result();
        var client = new HttpClient();
        var response = client.GetAsync(currentMonitor.TestUrl);
        var content = await response.Result.Content.ReadAsStringAsync();
        bool containSuccessText = content.Contains(currentMonitor.SuccessText);
        bool isOk = status == HttpStatusCode.OK;
        result.Success = containSuccessText && isOk;

        if (result.Success)
        {
            ExecutePipeline("MonitorSuccessPipeline", currentMonitor);
        }
        else
        {
            ExecutePipeline("MonitorFailurePipeline", currentMonitor);
        }

        return result;
    }

Next, I have changed the RunSiteConfigurationMonitors method to be:

    private async Task<Result> RunSiteConfigurationMonitors(SiteConfiguration siteConfig)
    {
        Result result = new Result();
        List<Task<Result>> tasks = new List<Task<Result>>();
        foreach (var currentMonitor in siteConfig.Monitors)
        {
            tasks.Add(RunMonitor(siteConfig, globalConfig, currentMonitor));
        }
        var results = await Task.WhenAll(tasks);
        foreach (var r in results)
        {
            result.Insert(r.Body);
        }

        ExecutePipeline("SiteAfterMonitorsPipeline", siteConfig);

        return result;
    }

Finally, I changed the RunEngine method to be:

    public async Task RunEngine(string configName)
    {
        GlobalConfiguration globalConfig = GetConfiguration(configName);
        List<Task<Result>> tasks = new List<Task<Result>>();
        foreach (var sc in globalConfig.SiteConfigurations)
        {
            tasks.Add(RunSiteConfigurationMonitors(sc, globalConfig));
        }
        var results = await Task.WhenAll(tasks);
        foreach (var r in results)
        {
            logger.Insert(r.Body);
        }

        ExecutePipeline("EndEnginePipeline", globalConfig);
    }

I expected to see the RunEngine method call all the RunSiteConfigurationMonitors and see them run in parallel at the same time - and in turn each RunSiteConfigurationMonitors call the RunMonitor and see those run at the same time.

Instead, this is what was the output on a subset of monitors:

[21:56:50.419]: Loading configuration...
[21:56:58.480]: [21:56:52.711]: Monitoring site A
[21:56:58.583]: [21:56:53.954]: [21:56:52.753]: Found monitor: A1
[21:56:58.687]: [21:56:53.954]: [21:56:52.753]: Testing URL: http://a1.example.com
[21:56:58.791]: [21:56:53.954]: [21:56:53.106]: -- Contains success text? Yes
[21:56:58.894]: [21:56:53.954]: [21:56:53.106]: -- Status Code: OK
[21:56:58.997]: [21:56:53.954]: [21:56:53.106]: -- Success? True
[21:56:59.100]: [21:56:53.954]: 
[21:56:59.203]: [21:56:53.954]: [21:56:53.474]: Found monitor: A2
[21:56:59.306]: [21:56:53.954]: [21:56:53.474]: Testing URL: http://a2.example.com
[21:56:59.375]: [21:56:53.954]: [21:56:53.761]: -- Contains success text? Yes
[21:56:59.478]: [21:56:53.954]: [21:56:53.762]: -- Status Code: OK
[21:56:59.582]: [21:56:53.954]: [21:56:53.762]: -- Success? True
[21:56:59.686]: [21:56:53.954]: 
[21:56:59.790]: 
[21:56:59.894]: [21:56:54.126]: Monitoring site: B
[21:56:59.998]: [21:56:56.424]: [21:56:54.126]: Found monitor: B1
[21:57:00.101]: [21:56:56.424]: [21:56:54.126]: Testing URL: http://b1.example.com
[21:57:00.204]: [21:56:56.424]: [21:56:55.225]: -- Contains success text? Yes
[21:57:00.307]: [21:56:56.424]: [21:56:55.225]: -- Status Code: OK
[21:57:00.410]: [21:56:56.424]: [21:56:55.225]: -- Success? True
[21:57:00.515]: [21:56:56.424]: 
[21:57:00.619]: [21:56:56.424]: [21:56:55.428]: Found monitor: B2
[21:57:00.724]: [21:56:56.424]: [21:56:55.429]: Testing URL: http://b2.example.com
[21:57:00.827]: [21:56:56.424]: [21:56:56.254]: -- Contains success text? Yes
[21:57:00.931]: [21:56:56.424]: [21:56:56.254]: -- Status Code: OK
[21:57:01.036]: [21:56:56.424]: [21:56:56.254]: -- Success? True
[21:57:01.140]: [21:56:56.424]: 
[21:57:01.244]: 
[21:57:01.348]: [21:56:56.597]: Monitoring site: C
[21:57:01.452]: [21:56:58.206]: [21:56:56.597]: Found monitor: C1
[21:57:01.557]: [21:56:58.206]: [21:56:56.597]: Testing URL: http://c1.example.com
[21:57:01.662]: [21:56:58.206]: [21:56:57.219]: -- Contains success text? Yes
[21:57:01.766]: [21:56:58.206]: [21:56:57.219]: -- Status Code: OK
[21:57:01.869]: [21:56:58.206]: [21:56:57.219]: -- Success? True
[21:57:01.974]: [21:56:58.206]: 
[21:57:02.078]: [21:56:58.206]: [21:56:57.418]: Found monitor: C2
[21:57:02.182]: [21:56:58.206]: [21:56:57.418]: Testing URL: http://c2.example.com
[21:57:02.287]: [21:56:58.206]: [21:56:58.025]: -- Contains success text? Yes
[21:57:02.392]: [21:56:58.206]: [21:56:58.025]: -- Status Code: OK
[21:57:02.496]: [21:56:58.206]: [21:56:58.025]: -- Success? True
[21:57:02.602]: [21:56:58.206]: 
[21:57:02.706]: 

As you can see, the monitors were essentially run in sequential order, and the site configurations also were not run in "parallel".

Were my expectations wrong to begin with, or is my code faulty? Like I said, this is a new world for me, therefore I would really appreciate a simple explanation, if possible.

SOLUTION

Following Stephen Cleary's suggestion below, I changed the RunMonitor to be:

    private async Task<Result> RunMonitor(Monitor currentMonitor)
    {
        Result result = new Result();
        var client = new HttpClient();
        var response = await client.GetAsync(currentMonitor.TestUrl);
        var content = await response.Content.ReadAsStringAsync();
        var status = response.StatusCode;
        bool containSuccessText = content.Contains(currentMonitor.SuccessText);
        bool isOk = status == HttpStatusCode.OK;
        result.Success = containSuccessText && isOk;

        if (result.Success)
        {
            ExecutePipeline("MonitorSuccessPipeline", currentMonitor);
        }
        else
        {
            ExecutePipeline("MonitorFailurePipeline", currentMonitor);
        }

        return result;
    }

and now it's working in parallel!

[23:33:44.199]: Loading configuration...
[23:33:48.968]: [23:33:46.601]: Monitoring site A
[23:33:49.073]: [23:33:47.637]: [23:33:46.625]: Found monitor: A1
[23:33:49.176]: [23:33:47.637]: [23:33:46.625]: Testing URL: http://a1.example.com
[23:33:49.281]: [23:33:47.637]: [23:33:47.047]: -- Contains success text? Yes
[23:33:49.386]: [23:33:47.637]: [23:33:47.047]: -- Status Code: OK
[23:33:49.490]: [23:33:47.637]: [23:33:47.047]: -- Success? True
[23:33:49.594]: [23:33:47.637]: 
[23:33:49.667]: [23:33:47.637]: [23:33:46.692]: Found monitor: A2
[23:33:49.770]: [23:33:47.637]: [23:33:46.692]: Testing URL: http://a2.example.com
[23:33:49.874]: [23:33:47.637]: [23:33:47.461]: -- Contains success text? Yes
[23:33:49.978]: [23:33:47.637]: [23:33:47.461]: -- Status Code: OK
[23:33:50.082]: [23:33:47.637]: [23:33:47.461]: -- Success? True
[23:33:50.186]: [23:33:47.637]: 
[23:33:50.292]: 
[23:33:50.396]: [23:33:46.727]: Monitoring site B
[23:33:50.500]: [23:33:48.690]: [23:33:46.727]: Found monitor: B1
[23:33:50.604]: [23:33:48.690]: [23:33:46.727]: Testing URL: http://b1.example.com
[23:33:50.708]: [23:33:48.690]: [23:33:48.547]: -- Contains success text? Yes
[23:33:50.812]: [23:33:48.690]: [23:33:48.547]: -- Status Code: OK
[23:33:50.915]: [23:33:48.690]: [23:33:48.547]: -- Success? True
[23:33:51.019]: [23:33:48.690]: 
[23:33:51.124]: [23:33:48.690]: [23:33:46.727]: Found monitor: B2
[23:33:51.228]: [23:33:48.690]: [23:33:46.727]: Testing URL: http://b2.example.com
[23:33:51.332]: [23:33:48.690]: [23:33:48.336]: -- Contains success text? Yes
[23:33:51.437]: [23:33:48.690]: [23:33:48.336]: -- Status Code: OK
[23:33:51.541]: [23:33:48.690]: [23:33:48.336]: -- Success? True
[23:33:51.645]: [23:33:48.690]: 
[23:33:51.749]: 
[23:33:51.852]: [23:33:46.728]: Monitoring site C
[23:33:51.956]: [23:33:48.161]: [23:33:46.728]: Found monitor: C1
[23:33:52.060]: [23:33:48.161]: [23:33:46.728]: Testing URL: http://c1.example.com
[23:33:52.165]: [23:33:48.161]: [23:33:47.813]: -- Contains success text? Yes
[23:33:52.271]: [23:33:48.161]: [23:33:47.813]: -- Status Code: OK
[23:33:52.375]: [23:33:48.161]: [23:33:47.813]: -- Success? True
[23:33:52.479]: [23:33:48.161]: 
[23:33:52.583]: [23:33:48.161]: [23:33:46.760]: Found monitor: C2
[23:33:52.688]: [23:33:48.161]: [23:33:46.760]: Testing URL: http://c2.example.com
[23:33:52.793]: [23:33:48.161]: [23:33:47.987]: -- Contains success text? Yes
[23:33:52.897]: [23:33:48.161]: [23:33:47.987]: -- Status Code: OK
[23:33:53.001]: [23:33:48.161]: [23:33:47.987]: -- Success? True
[23:33:53.105]: [23:33:48.161]: 
[23:33:53.208]: 

I have to admit, though, that I didn't really understand why this worked.

One of the guidelines for asynchronous programming is to not block on asynchronous code. Consider this code in RunMonitor :

var response = client.GetAsync(currentMonitor.TestUrl);
var content = await response.Result.Content.ReadAsStringAsync();

The type of response is a task, and the next line accesses Task<T>.Result , which synchronously blocks the method until the HTTP request is complete. So this method is not operating asynchronously. What you probably want is:

var response = await client.GetAsync(currentMonitor.TestUrl);
var content = await response.Content.ReadAsStringAsync();

The comments are generally correct. The scheduling of the tasks depends on the runtime, whether it is a web application, console application etc. Alternatively, if your application is just a monitoring tool that gets executed per a schedule, you could use Parallel.ForEach() set of methods in your RunEngine() and RunSiteConfigurationMonitors instead of awaiting tasks. Each of your RunMonitor would be writing results to a ConcurrentBag<> . With Parallel.ForEach you could also control the degree of parallelism via ParallelOptions , but, again, never more that the number of threads provided by the scheduler. Hope this helps.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM