
Async/await performance

I'm working on performance optimization of a program that makes heavy use of async/await. Generally speaking, it downloads thousands of JSON documents over HTTP in parallel, parses them, and builds a response from the data. We have performance problems: when we handle many requests simultaneously (e.g. downloading 1,000 JSON documents), a simple HTTP request can take a few minutes.

I wrote a small console app to test it on a simplified example:

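using System;
using System.Diagnostics;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Start 100,000 tasks without awaiting them (fire and forget).
        for (int i = 0; i < 100000; i++)
        {
            Task.Run(IoBoundWork);
        }

        // Keep the process alive while the tasks print their timings.
        Console.ReadKey();
    }

    private static async Task IoBoundWork()
    {
        var sw = Stopwatch.StartNew();

        await Task.Delay(1000);

        // Time observed by this task between starting and resuming after the delay.
        Console.WriteLine(sw.Elapsed);
    }
}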

And I can see similar behavior here (the screenshot from the original post is not preserved):

The question is why "await Task.Delay(1000)" eventually takes 23 seconds.

Task.Delay isn't broken, but you're performing 100,000 tasks which each take some time. It's the call to Console.WriteLine that is causing the problem in this particular case. Each call is cheap, but they all access a shared resource, so they don't parallelize well.

If you remove the call to Console.WriteLine, all the tasks complete very quickly. I changed your code to return the elapsed time that each task observes, and then print just a single line of output at the end: the maximum observed time. On my computer, without any Console.WriteLine call, I see output of about 1.16 seconds, showing very little inefficiency:

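using System;
using System.Linq;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Ask the pool to create up to this many threads on demand instead of
        // adding them gradually. SetMinThreads returns false and changes nothing
        // if a value exceeds the pool's maximum; the result is not checked here.
        ThreadPool.SetMinThreads(50000, 50000);
        var tasks = Enumerable.Repeat(0, 100000)
            .Select(_ => Task.Run(IoBoundWork))
            .ToArray();
        Task.WaitAll(tasks);
        // Print one line at the end: the longest time any task observed.
        var maxTime = tasks.Max(t => t.Result);
        Console.WriteLine($"Max: {maxTime}");
    }

    // Returns the elapsed time in seconds as seen by this task.
    private static async Task<double> IoBoundWork()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        return sw.Elapsed.TotalSeconds;
    }
}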

You can then modify IoBoundWork to do different kinds of work, and see the effect. Examples of work to try:

  • CPU work (do something actively "hard" for the CPU, but briefly)
  • Synchronous sleeping (so the thread is blocked, but the CPU isn't)
  • Synchronous IO which doesn't have any shared bottlenecks (although that's generally hard, given that the disk or network is likely to end up being a shared resource bottleneck even if you're writing to different files etc)
  • Synchronous IO with a shared bottleneck such as Console.WriteLine
  • Asynchronous IO (await foo.WriteAsync(...) etc)
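
A few of these variations could be sketched as interchangeable workloads for the same kind of harness. This is only an illustration under stated assumptions: the names WorkloadComparison, DelayThen, MaxObservedSeconds, CpuWork, SyncSleep and SharedConsoleWrite are hypothetical, the task count is reduced to 2,000 so the blocking case finishes in reasonable time, and the loop size and sleep duration are arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class WorkloadComparison
{
    // Same shape as IoBoundWork: wait one second, then run `work`, and report
    // how long the whole thing took from this task's point of view.
    private static async Task<double> DelayThen(Action work)
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        work();
        return sw.Elapsed.TotalSeconds;
    }

    // Starts `count` tasks and returns the largest elapsed time any of them saw.
    public static double MaxObservedSeconds(Action work, int count)
    {
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => DelayThen(work)))
            .ToArray();
        Task.WaitAll(tasks);
        return tasks.Max(t => t.Result);
    }

    // Brief CPU-bound work: keeps a core busy and never releases the thread.
    public static void CpuWork()
    {
        double acc = 0;
        for (int i = 1; i <= 200000; i++) acc += Math.Sqrt(i);
        GC.KeepAlive(acc); // use the result so the loop is not dead code
    }

    // Synchronous sleep: blocks a pool thread without using the CPU.
    public static void SyncSleep()
    {
        Thread.Sleep(10);
    }

    // Synchronous IO with a shared bottleneck: every task writes to the console.
    public static void SharedConsoleWrite()
    {
        Console.WriteLine("done");
    }

    static void Main()
    {
        const int count = 2000; // assumption: far fewer than the 100,000 above

        double console = MaxObservedSeconds(SharedConsoleWrite, count);
        double cpu = MaxObservedSeconds(CpuWork, count);
        double sleep = MaxObservedSeconds(SyncSleep, count);

        Console.WriteLine($"Console write: {console:F2}s");
        Console.WriteLine($"CPU work:      {cpu:F2}s");
        Console.WriteLine($"Sync sleep:    {sleep:F2}s");
    }
}
```

The numbers depend heavily on the machine and on the thread pool settings, so they are best compared against each other rather than read as absolute values.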

You can also try removing the call to Task.Delay(1000) or changing it. I found that with the call removed entirely, the result was very small, whereas replacing it with Task.Yield gave a result very similar to Task.Delay. It's worth remembering that as soon as your async method actually has to "pause", you're effectively doubling the task scheduling problem: instead of scheduling 100,000 operations, you're scheduling 200,000.
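
The three variants mentioned here (no pause, Task.Yield, Task.Delay) might be compared with something like the following. It is a minimal sketch, not the code used for the measurements above; PauseComparison and its method names are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class PauseComparison
{
    // Never waits on anything incomplete, so the method runs to completion
    // in one go and nothing has to be scheduled a second time.
    public static Task<double> NoPause()
    {
        var sw = Stopwatch.StartNew();
        return Task.FromResult(sw.Elapsed.TotalSeconds);
    }

    // Task.Yield always suspends the method; the remainder is queued as a
    // continuation, so each task is scheduled twice.
    public static async Task<double> YieldOnly()
    {
        var sw = Stopwatch.StartNew();
        await Task.Yield();
        return sw.Elapsed.TotalSeconds;
    }

    // The original workload: suspend, then resume about a second later.
    public static async Task<double> DelayOneSecond()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        return sw.Elapsed.TotalSeconds;
    }

    // Starts `count` tasks and returns the largest elapsed time any of them saw.
    public static double MaxObservedSeconds(Func<Task<double>> work, int count)
    {
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(work))
            .ToArray();
        Task.WaitAll(tasks);
        return tasks.Max(t => t.Result);
    }

    static void Main()
    {
        const int count = 100000;
        Console.WriteLine($"No pause:   {MaxObservedSeconds(NoPause, count):F3}s");
        Console.WriteLine($"Task.Yield: {MaxObservedSeconds(YieldOnly, count):F3}s");
        Console.WriteLine($"Task.Delay: {MaxObservedSeconds(DelayOneSecond, count):F3}s");
    }
}
```

For the Task.Delay case, the interesting figure is how far the maximum exceeds one second, since the delay itself is included in the measurement.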

You'll see a different pattern in each case. Fundamentally, you're starting 100,000 tasks, asking them all to wait for a second, then asking them all to do something. That causes issues with continuation scheduling, which is specific to async/await, but there is also a plain resource-management issue: performing 100,000 tasks, each of which needs to write to the console, is going to take a while.
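
One way to see the console's share of the cost is to run the same tasks twice, once writing from every task and once only returning the measurement and printing a summary at the end. The sketch below assumes a reduced count of 10,000; ConsoleContention, WriteFromTask, ReturnOnly and RunAll are hypothetical names.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class ConsoleContention
{
    // Each task writes its own line, so all tasks contend for the console.
    public static async Task<double> WriteFromTask()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        Console.WriteLine(sw.Elapsed);
        return sw.Elapsed.TotalSeconds;
    }

    // Each task only returns its measurement; nothing shared is touched.
    public static async Task<double> ReturnOnly()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        return sw.Elapsed.TotalSeconds;
    }

    // Starts `count` tasks and blocks until all results are available.
    public static double[] RunAll(Func<Task<double>> work, int count)
    {
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(work))
            .ToArray();
        return Task.WhenAll(tasks).GetAwaiter().GetResult();
    }

    static void Main()
    {
        const int count = 10000; // assumption: smaller than in the question

        double[] withWrites = RunAll(WriteFromTask, count);
        double[] withoutWrites = RunAll(ReturnOnly, count);

        // A single write at the end instead of one per task.
        Console.WriteLine($"Max with per-task writes: {withWrites.Max():F2}s");
        Console.WriteLine($"Max without writes:       {withoutWrites.Max():F2}s");
    }
}
```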

If your problem is performance, async-await is the wrong solution.

async-await is all about availability: availability to handle the screen and user input, availability to handle HTTP requests, etc.

The synchronization work behind async-await will use more resources and take more time than simply blocking until the operation completes.

Your HTTP server will handle more requests, because fewer threads will be blocked waiting for operations to complete, but each request will take slightly longer.
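
The difference between blocking a thread and releasing it can be illustrated with a rough sketch like the one below. It stands in Thread.Sleep for a blocking call and Task.Delay for an asynchronous one; the class name, method names, the 500 ms wait and the count of 200 are all assumptions made for the example, and no real HTTP traffic is involved.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class BlockingVersusAsync
{
    private const int WaitMs = 500; // assumed duration of one simulated operation

    // Blocking version: every operation holds a pool thread for the whole wait.
    public static TimeSpan RunBlocking(int count)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => Thread.Sleep(WaitMs)))
            .ToArray();
        Task.WaitAll(tasks);
        return sw.Elapsed;
    }

    // Asynchronous version: no thread is held while the operations wait.
    public static TimeSpan RunAsync(int count)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Delay(WaitMs))
            .ToArray();
        Task.WaitAll(tasks);
        return sw.Elapsed;
    }

    static void Main()
    {
        const int count = 200;
        Console.WriteLine($"Blocking: {RunBlocking(count).TotalSeconds:F2}s for {count} operations");
        Console.WriteLine($"Async:    {RunAsync(count).TotalSeconds:F2}s for {count} operations");
    }
}
```

With default thread pool settings the blocking run will probably take longer in total, because the operations have to queue for the available threads. That is the availability effect described above; the sketch does not measure the small extra cost per operation that the asynchronous version carries.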

Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you republish them, please credit this site's URL or the original source.

 