Entity Framework Queryable async

I'm working on some Web API code using Entity Framework 6, and one of my controller methods is a "Get All" that expects to receive the contents of a table from my database as IQueryable<Entity>. In my repository, I'm wondering whether there is any advantage to doing this asynchronously, as I'm new to using EF with async.

Basically it boils down to

 public async Task<IQueryable<URL>> GetAllUrlsAsync()
 {
    var urls = await context.Urls.ToListAsync();
    return urls.AsQueryable();
 }

vs

 public IQueryable<URL> GetAllUrls()
 {
    return context.Urls.AsQueryable();
 }

Will the async version actually yield performance benefits here, or am I incurring unnecessary overhead by projecting to a List first (using async, mind you) and THEN going to IQueryable?

The problem seems to be that you have misunderstood how async/await works with Entity Framework.

About Entity Framework

So, let's look at this code:

public IQueryable<URL> GetAllUrls()
{
    return context.Urls.AsQueryable();
}

and an example of its usage:

repo.GetAllUrls().Where(u => <condition>).Take(10).ToList()

What happens there?

  1. We get an IQueryable object (without accessing the database yet) using repo.GetAllUrls()
  2. We create a new IQueryable object with the specified condition using .Where(u => <condition>)
  3. We create a new IQueryable object with the specified paging limit using .Take(10)
  4. We retrieve results from the database using .ToList(). Our IQueryable object is compiled to SQL (like select top 10 * from Urls where <condition>). The database can use indexes, and SQL Server sends you only 10 objects (not all billion URLs stored in the database)
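The deferred execution in the steps above can be sketched without a database. This is only an analogy, assuming LINQ to Objects instead of Entity Framework, so no SQL is generated; DeferredExecutionDemo, Source, and ItemsRead are hypothetical names introduced for illustration. It shows that composing Where and Take reads nothing from the source until ToList() is called.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many items have actually been pulled from the source.
    public static int ItemsRead;

    // Hypothetical stand-in for a table: yields values lazily and counts each read.
    private static IEnumerable<int> Source()
    {
        for (var i = 1; i <= 1000; i++)
        {
            ItemsRead++;
            yield return i;
        }
    }

    public static (int ReadBefore, List<int> Result) Run()
    {
        ItemsRead = 0;

        // Building the query only composes an expression; nothing is enumerated yet.
        IQueryable<int> query = Source().AsQueryable()
            .Where(n => n % 2 == 0)
            .Take(3);

        var readBefore = ItemsRead;   // no items have been read at this point

        // ToList() is what finally executes the composed query.
        var result = query.ToList();

        return (readBefore, result);
    }
}
```

Calling DeferredExecutionDemo.Run() returns a ReadBefore of 0 and the first three even numbers. With Entity Framework the same composition would be translated to SQL rather than evaluated in memory.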

Okay, let's look at the first code:

public async Task<IQueryable<URL>> GetAllUrlsAsync()
{
    var urls = await context.Urls.ToListAsync();
    return urls.AsQueryable();
}

With the same usage example we get:

  1. We load all billion URLs stored in your database into memory using await context.Urls.ToListAsync();
  2. We run out of memory. The right way to kill your server
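The cost of materializing first can be sketched the same way. Again this assumes LINQ to Objects rather than Entity Framework, and EagerLoadDemo, Source, and ItemsRead are hypothetical names. The synchronous ToList() here stands in for ToListAsync(); whether the call is awaited or not does not change how many rows are read.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EagerLoadDemo
{
    // Counts how many items have actually been pulled from the source.
    public static int ItemsRead;

    // Hypothetical stand-in for a table with 1000 rows.
    private static IEnumerable<int> Source()
    {
        for (var i = 1; i <= 1000; i++)
        {
            ItemsRead++;
            yield return i;
        }
    }

    public static List<int> Run()
    {
        ItemsRead = 0;

        // Analog of "await context.Urls.ToListAsync()": every row is read up front.
        List<int> everything = Source().ToList();

        // Filtering and paging now happen over the in-memory list,
        // after the full read has already been paid for.
        return everything.AsQueryable()
            .Where(n => n % 2 == 0)
            .Take(3)
            .ToList();
    }
}
```

After EagerLoadDemo.Run(), ItemsRead is 1000 even though only three items are returned.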

About async/await

Why is async/await preferred? Let's look at this code:

var stuff1 = repo.GetStuff1ForUser(userId);
var stuff2 = repo.GetStuff2ForUser(userId);
return View(new Model(stuff1, stuff2));

What happens here?

  1. We start on line 1: var stuff1 = ...
  2. We send a request to SQL Server to get some stuff1 for userId
  3. We wait (the current thread is blocked)
  4. We wait (the current thread is blocked)
  5. .....
  6. SQL Server sends us the response
  7. We move to line 2: var stuff2 = ...
  8. We send a request to SQL Server to get some stuff2 for userId
  9. We wait (the current thread is blocked)
  10. And again
  11. .....
  12. SQL Server sends us the response
  13. We render the view

So let's look at an async version of it:

var stuff1Task = repo.GetStuff1ForUserAsync(userId);
var stuff2Task = repo.GetStuff2ForUserAsync(userId);
await Task.WhenAll(stuff1Task, stuff2Task);
return View(new Model(stuff1Task.Result, stuff2Task.Result));

What happens here?

  1. We send a request to SQL Server to get stuff1 (line 1)
  2. We send a request to SQL Server to get stuff2 (line 2)
  3. We wait for the responses from SQL Server, but the current thread isn't blocked; it can handle requests from other users
  4. We render the view
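The pattern in these steps can be sketched in a self-contained form. This is an illustration under assumptions, not the repository from the question: ConcurrentQueriesDemo and its two methods are hypothetical, and Task.Delay stands in for the database round trip. With EF6 itself, a single DbContext instance does not support two asynchronous operations in flight at once, so the two repository calls would presumably need separate contexts.

```csharp
using System.Threading.Tasks;

public static class ConcurrentQueriesDemo
{
    // Hypothetical stand-in for repo.GetStuff1ForUserAsync; Task.Delay simulates I/O.
    private static async Task<string> GetStuff1ForUserAsync(int userId)
    {
        await Task.Delay(200);
        return $"stuff1 for {userId}";
    }

    // Hypothetical stand-in for repo.GetStuff2ForUserAsync.
    private static async Task<string> GetStuff2ForUserAsync(int userId)
    {
        await Task.Delay(200);
        return $"stuff2 for {userId}";
    }

    public static async Task<(string Stuff1, string Stuff2)> LoadAsync(int userId)
    {
        // Start both operations before awaiting either, so they overlap.
        Task<string> stuff1Task = GetStuff1ForUserAsync(userId);
        Task<string> stuff2Task = GetStuff2ForUserAsync(userId);

        // The calling thread is released while both operations are pending.
        await Task.WhenAll(stuff1Task, stuff2Task);

        // Both tasks have completed, so awaiting them returns immediately.
        return (await stuff1Task, await stuff2Task);
    }
}
```

Awaiting the completed tasks instead of reading .Result gives the same values; it is mostly a matter of style once Task.WhenAll has finished.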

The right way to do it

So the good code looks like this:

using System.Data.Entity;

public IQueryable<URL> GetAllUrls()
{
   return context.Urls.AsQueryable();
}

public async Task<List<URL>> GetAllUrlsByUser(int userId) {
   return await GetAllUrls().Where(u => u.User.Id == userId).ToListAsync();
}

Note that you must add using System.Data.Entity in order to use the ToListAsync() method on IQueryable.

Note that if you don't need filtering, paging, and so on, you don't need to work with IQueryable. You can just use await context.Urls.ToListAsync() and work with the materialized List<URL>.

There is a massive difference between the examples you have posted. The first version:

var urls = await context.Urls.ToListAsync();

This is bad: it basically does select * from table, returns all results into memory, and then applies the where against that in-memory collection rather than doing select * from table where... against the database.

The second method will not actually hit the database until a query is applied to the IQueryable (probably via a LINQ .Where().Select() style operation), which will only return the db values that match the query.

If your examples were comparable, the async version would usually be slightly slower per request, as there is more overhead in the state machine which the compiler generates to allow the async functionality.

However, the major difference (and benefit) is that the async version allows more concurrent requests, as it doesn't block the processing thread whilst it is waiting for IO to complete (db query, file access, web request etc).

Long story short,
IQueryable is designed to postpone execution: it first builds the expression in conjunction with other IQueryable expressions, and then interprets and runs the expression as a whole.
But the ToList() method (and a few similar methods) is meant to run the expression immediately, "as is".
Your first method (GetAllUrlsAsync) will run immediately, because it is an IQueryable followed by the ToListAsync() method. Hence it runs right away (asynchronously) and returns an in-memory collection of results.
Meanwhile, your second method (GetAllUrls) won't run anything. Instead, it returns an expression, and the CALLER of this method is responsible for running the expression.

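Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.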
