Individual await vs Task.WhenAll

I have the following two methods, which produce the same results.

    public static async Task<IEnumerable<RiskDetails>> ExecuteSqlStoredProcedureSelect<T>(IEnumerable<AccountInfo> linkedAccounts, string connectionString, string storedProcedure, int connTimeout = 10)
    {
        var responseList = new List<RiskDetails>();

        using (IDbConnection conn = new SqlConnection(connectionString))
        {
            foreach (var account in linkedAccounts)
            {
                var enumResults = await conn.QueryAsync<RiskDetails>(storedProcedure,
                    new { UserID = account.UserID, CasinoID = account.CasinoID, GamingServerID = account.GamingServerID, AccountNo = account.AccountNumber, Group = account.GroupCode, EmailAddress = account.USEMAIL },
                    commandType: CommandType.StoredProcedure);

                if (enumResults != null)
                    foreach (var response in enumResults)
                        responseList.Add(response);
            }
        }

        return responseList;
    }

    public static async Task<IEnumerable<RiskDetails>> ExecuteSqlStoredProcedureSelectParallel<T>(IEnumerable<AccountInfo> linkedAccounts, string connectionString, string storedProcedure, int connTimeout = 10)
    {
        List<Task<IEnumerable<RiskDetails>>> tasks = new List<Task<IEnumerable<RiskDetails>>>();
        var responseList = new List<RiskDetails>();

        using (IDbConnection conn = new SqlConnection(connectionString))
        {
            conn.Open();

            foreach (var account in linkedAccounts)
            {
                var enumResults = conn.QueryAsync<RiskDetails>(storedProcedure,
                        new { UserID = account.UserID, CasinoID = account.CasinoID, GamingServerID = account.GamingServerID, AccountNo = account.AccountNumber, Group = account.GroupCode, EmailAddress = account.USEMAIL },
                        commandType: CommandType.StoredProcedure, commandTimeout: 0);

                //add task
                tasks.Add(enumResults);
            }

            //await and get results
            var results = await Task.WhenAll(tasks);
            foreach (var value in results)
                foreach (var riskDetail in value)
                    responseList.Add(riskDetail);
        }

        return responseList;
    }

My understanding of how ExecuteSqlStoredProcedureSelect executes is as follows:

  • Execute query for Account #1
  • Wait for the result of query #1
  • Receive the result of query #1
  • Execute query for Account #2
  • Wait for the result of query #2
  • etc.

My understanding of how ExecuteSqlStoredProcedureSelectParallel executes is as follows:

  • Add all tasks to an IEnumerable instance
  • Call Task.WhenAll, which will start executing queries for Account #n
  • Queries are executed relatively parallel against SQL Server
  • Task.WhenAll returns when all queries have executed

As I understand it, ExecuteSqlStoredProcedureSelectParallel should be at least a little faster, but at the moment there is no improvement.

Is my understanding of this wrong?

Your understanding of ExecuteSqlStoredProcedureSelectParallel is not completely correct.

"Call Task.WhenAll, which will start executing queries for Account #n"

Task.WhenAll does not start anything. By the time the QueryAsync method returns, the task has already started and is running, or has even completed. When control reaches Task.WhenAll, all tasks have already started.
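The point can be checked without a database. The following is a minimal sketch in which FakeQueryAsync is a hypothetical stand-in for conn.QueryAsync (it is not part of Dapper), and Task.Delay simulates the round trip. It records when each "query" starts relative to the Task.WhenAll call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StartOrderDemo
{
    // Hypothetical stand-in for conn.QueryAsync: logs its start, then "waits on I/O".
    private static async Task<int> FakeQueryAsync(int id, List<string> log)
    {
        log.Add($"query {id} started"); // runs synchronously, before the first await
        await Task.Delay(50);           // simulated round trip to the server
        return id;
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        var tasks = new List<Task<int>>();

        for (int id = 1; id <= 3; id++)
            tasks.Add(FakeQueryAsync(id, log)); // the work starts here, on the call

        log.Add("calling Task.WhenAll");
        await Task.WhenAll(tasks);              // only waits for completion
        return log;
    }

    public static async Task Main()
    {
        foreach (var line in await RunAsync())
            Console.WriteLine(line);
    }
}
```

All three "started" lines are logged before "calling Task.WhenAll", because an async method runs synchronously up to its first incomplete await as soon as it is called.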

"Queries are executed relatively parallel against SQL Server"

This is a complicated subject. To be able to execute multiple queries concurrently over the same SQL connection, you need the MultipleActiveResultSets option enabled in your connection string. Without it, this will not work (an exception is thrown).
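One way to switch the option on is through SqlConnectionStringBuilder, as in the sketch below. It assumes the System.Data.SqlClient package that the question's SqlConnection comes from; the server and database names are placeholders, and the helper name Build is made up for illustration.

```csharp
using System;
using System.Data.SqlClient;

public static class MarsConnectionString
{
    // Returns a copy of the given connection string with MARS enabled.
    public static string Build(string baseConnectionString)
    {
        var builder = new SqlConnectionStringBuilder(baseConnectionString)
        {
            MultipleActiveResultSets = true
        };
        return builder.ConnectionString;
    }

    public static void Main()
    {
        // Placeholder server and database names, for illustration only.
        string original = "Data Source=localhost;Initial Catalog=ExampleDb;Integrated Security=True";
        Console.WriteLine(Build(original));
    }
}
```

Appending the setting by hand to the existing connection string has the same effect; the builder just avoids duplicate or malformed keys.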

Then, in many places, including the documentation, you can read that MARS is not about parallel execution. It is about statement interleaving, which means SQL Server might switch between different statements executing over the same connection, much like an OS might switch between threads (on a single core). Quoting the documentation:

"MARS operations execute synchronously on the server. Statement interleaving of SELECT and BULK INSERT statements is allowed. However, data manipulation language (DML) and data definition language (DDL) statements execute atomically. Any statements attempting to execute while an atomic batch is executing are blocked. Parallel execution at the server is not a MARS feature."

Now, even if your select queries did execute in parallel on the server, that would not help you much in terms of performance if those queries execute quickly.

Suppose you query for 10 accounts, and each query takes 1 ms to execute (pretty normal; I'd say it is the expected situation). But each query returns, say, 100 rows. Those 100 rows then have to be delivered over the network to the caller. That is the most costly part; execution time is negligible by comparison (in this specific example). Whether you use MARS or not, you have just one physical connection to SQL Server. Even if your 10 queries are executed in parallel on the server (which I doubt, because of the above), their results cannot be delivered to you in parallel, because you have one physical connection. So 10 * 100 = 1000 rows are, in both cases, delivered to you "sequentially".

From that it should be clear that you should not expect your Parallel version to execute noticeably faster. If you want it to be really parallel, use a separate connection for each command.
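A separate connection per command could look roughly like the sketch below. It reuses Dapper's QueryAsync and the RiskDetails and AccountInfo types from the question (they are not defined here); the class and method names are made up for illustration, and this is one possible shape rather than a tested replacement.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Threading.Tasks;
using Dapper;

public static class RiskQueries
{
    // Hypothetical variant of the question's Parallel method: one connection per query.
    public static async Task<IEnumerable<RiskDetails>> ExecuteSqlStoredProcedureSelectPerConnection(
        IEnumerable<AccountInfo> linkedAccounts, string connectionString, string storedProcedure)
    {
        // Each call starts its query immediately, on its own connection.
        var tasks = linkedAccounts
            .Select(account => QueryOneAsync(account, connectionString, storedProcedure))
            .ToList();

        var results = await Task.WhenAll(tasks);
        return results.SelectMany(rows => rows).ToList();
    }

    private static async Task<IEnumerable<RiskDetails>> QueryOneAsync(
        AccountInfo account, string connectionString, string storedProcedure)
    {
        // Dapper opens a closed connection for the call; results are buffered
        // by default, so they remain usable after the connection is disposed.
        using (var conn = new SqlConnection(connectionString))
        {
            return await conn.QueryAsync<RiskDetails>(storedProcedure,
                new { UserID = account.UserID, CasinoID = account.CasinoID, GamingServerID = account.GamingServerID, AccountNo = account.AccountNumber, Group = account.GroupCode, EmailAddress = account.USEMAIL },
                commandType: CommandType.StoredProcedure);
        }
    }
}
```

The connections come from the ADO.NET connection pool, so with a large number of linked accounts it may be worth limiting how many queries are in flight at once.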

I also want to add that the number of physical cores on your machine has no significant impact on performance in this situation. Asynchronous IO is not about blocking threads, as you can read in numerous places on the internet.
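A rough way to see this is to start far more asynchronous waits than the machine has cores. In the sketch below, Task.Delay stands in for a pending I/O operation (an assumption made so the example runs without a database), and the names are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncWaitDemo
{
    // Starts `count` simulated I/O waits at once and returns the elapsed milliseconds.
    public static async Task<long> RunAsync(int count, int delayMs)
    {
        var stopwatch = Stopwatch.StartNew();

        // No thread is held while these waits are pending.
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Delay(delayMs))
            .ToList();

        await Task.WhenAll(tasks);
        return stopwatch.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        long elapsed = await RunAsync(count: 200, delayMs: 200);
        Console.WriteLine(
            $"{Environment.ProcessorCount} cores: 200 waits of 200 ms finished in {elapsed} ms");
    }
}
```

The total should stay close to a single delay rather than 200 times that, whatever Environment.ProcessorCount reports, because pending asynchronous waits do not occupy threads or cores.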

Well, your understanding is correct, but you also need to consider the underlying hardware: the number of physical cores your machine has.

You can create multiple tasks at a given time, but that does not mean all of those tasks run in parallel. Each task represents a thread that gets scheduled on a physical core, and one core runs one thread at a time.

So if your machine has, say, 4 cores and you create 8 threads, your machine will run only 4 threads at once. The other threads get their turn on a core when a running thread is blocked, in a wait state, or completed.

What I mean by the above is that when you write parallel code, you should also consider the number of physical cores on your machine. That could be one reason your code is not benefiting from the parallelism you have added.

One more thing: if the number of cores is less than the number of tasks/threads, there will be a lot of context switching, which can also slow down your program.

Adding to the above, the Task Parallel Library uses the thread pool under the hood, and thread pool threads are recommended for short operations. Long-running operations may use up your thread pool, and then your short-running operations have to wait for a thread to become free, which also slows down your application. So it is recommended to create such tasks with TaskCreationOptions.LongRunning, or to use async/await, so that your thread pool threads are not consumed by long-running operations (database operations, file reads/writes, or external web/web service calls to fetch data).
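As a small illustration of the LongRunning option, the sketch below asks a task whether it is running on a thread pool thread. The names are made up for illustration, and the behavior shown (a dedicated thread for LongRunning tasks on the default scheduler) is what current .NET runtimes do rather than a documented guarantee.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Runs a task with the given options and reports whether it ran on a pool thread.
    public static Task<bool> RunsOnPoolThread(TaskCreationOptions options)
    {
        return Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);
    }

    public static async Task Main()
    {
        bool normal = await RunsOnPoolThread(TaskCreationOptions.None);
        bool longRunning = await RunsOnPoolThread(TaskCreationOptions.LongRunning);

        Console.WriteLine($"default task on pool thread:     {normal}");
        Console.WriteLine($"LongRunning task on pool thread: {longRunning}");
    }
}
```

LongRunning is mainly relevant to synchronous, blocking work. A call such as QueryAsync that is awaited does not hold any thread while the query is pending, which is the async/await alternative mentioned above.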


Apart from the above, in your code,

    var results = await Task.WhenAll(tasks);

This means: wait until all tasks have completed. So if you have 5 tasks and 3 of them have completed but 2 are taking longer, your code will wait for those 2 long-running tasks to complete before executing the next line.
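The 5-task case can be simulated as in the sketch below, where Task.Delay stands in for the queries and the delays (three of 10 ms, two of 300 ms) are arbitrary values chosen for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Runs three fast and two slow simulated tasks and measures the total wait.
    public static async Task<(int[] results, long elapsedMs)> RunAsync()
    {
        var stopwatch = Stopwatch.StartNew();
        var delays = new[] { 10, 10, 10, 300, 300 };

        // Each task waits for its delay and then returns it as its result.
        var tasks = delays
            .Select(async ms => { await Task.Delay(ms); return ms; })
            .ToList();

        // Continues only after the slowest task has finished;
        // results come back in the same order as the tasks.
        int[] results = await Task.WhenAll(tasks);
        return (results, stopwatch.ElapsedMilliseconds);
    }

    public static async Task Main()
    {
        var (results, elapsedMs) = await RunAsync();
        Console.WriteLine($"{results.Length} results after {elapsedMs} ms");
    }
}
```

The elapsed time tracks the slowest task (about 300 ms here), not the sum of the delays and not the fast tasks.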


Check this also: Can a single SQL Server connection be shared among tasks executed in parallel?

"A SQL Server connection can be shared by multiple tasks executing in parallel, e.g. threads in a C# program or requests in an app server. But most usage scenarios would require you to synchronize access to the connection. A task will have to wait for the connection if another task is using it. By the time you build a shared connection mechanism that does not break or become a performance constraint for your parallel tasks, you have likely built a connection pool."
