Removing select N+1 without .Include

Consider these contrived entity objects:

public class Consumer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool NeedsProcessed { get; set; }
    public virtual IList<Purchase> Purchases { get; set; }  //virtual so EF can lazy-load
}

public class Purchase
{
    public int Id { get; set; }
    public decimal TotalCost { get; set; }
    public int ConsumerId { get; set; }
}

Now let's say I want to run this code:

var consumers = Consumers.Where(consumer => consumer.NeedsProcessed);

//assume that ProcessConsumers accesses the Consumer.Purchases property
SomeExternalServiceICannotModify.ProcessConsumers(consumers);

By default this will suffer from Select N+1 inside the ProcessConsumers method. It will trigger one query when it enumerates the consumers, then it'll grab each Purchases collection one by one. The standard solution to this problem would be to add an include:

var consumers = Consumers.Include("Purchases").Where(consumer => consumer.NeedsProcessed);

//assume that ProcessConsumers accesses the Consumer.Purchases property
SomeExternalServiceICannotModify.ProcessConsumers(consumers);

That works fine in many cases, but in some complex cases an include can degrade performance by orders of magnitude. Is it possible to do something like this:

  1. Grab my consumers, var consumers = _entityContext.Consumers.Where(...).ToList()
  2. Grab my purchases, var purchases = _entityContext.Purchases.Where(...).ToList()
  3. Hydrate the consumer.Purchases collections manually from the purchases I already loaded into memory. Then when I pass it to ProcessConsumers it won't trigger more db queries.

I'm not sure how to do #3. If you try to access any consumer.Purchases collection, that'll trigger the lazy load (and thus the Select N+1). Perhaps I need to cast the consumers to the proper type (instead of the EF proxy type) and then load the collection? Something like this:

foreach (var consumer in consumers)
{
     //since the EF proxy overrides the Purchases property, this doesn't really work, I'm trying to figure out what would
     ((Consumer)consumer).Purchases = purchases.Where(x => x.ConsumerId == consumer.Id).ToList();
}

EDIT: I have rewritten the example a bit to hopefully reveal the issue more clearly.

If I'm understanding correctly, you would like to load a filtered subset of Consumers, each with a filtered subset of its Purchases, in 1 query. If that's not correct, please forgive my misunderstanding of your intent. If that is correct, you could do something like:

var consumersAndPurchases = db.Consumers.Where(...)
    .Select(c => new {
        Consumer = c,
        RelevantPurchases = c.Purchases.Where(...)
    })
    .AsNoTracking()
    .ToList(); // loads in 1 query

// this should be OK because we did AsNoTracking()
// ToList() is needed because Purchases is an IList<Purchase>
consumersAndPurchases.ForEach(t => t.Consumer.Purchases = t.RelevantPurchases.ToList());

CannotModify.Process(consumersAndPurchases.Select(t => t.Consumer));

Note that this WON'T work if the Process function is expecting to modify the consumer object and then commit those changes back to the database.

EF will populate the consumer.Purchases collections for you if you use the same context to fetch both collections:

List<Consumer> consumers = null;
using ( var ctx = new XXXEntities() )
{
  consumers = ctx.Consumers.Where( ... ).ToList();

  // EF will populate consumers.Purchases when it loads these objects
  ctx.Purchases.Where( ... ).ToList();
}

// the Purchase objects are now in the consumer.Purchases collections
var sum = consumers.Sum( c => c.Purchases.Sum( p => p.TotalCost ) );

EDIT:

This results in just 2 db calls: 1 to get the collection of Consumers and 1 to get the collection of Purchases.

EF will look at each Purchase record returned and look up the corresponding Consumer record from Purchase.ConsumerId. It will then add the Purchase object to the Consumer.Purchases collection for you.
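
Conceptually, that fix-up step behaves like the in-memory sketch below. It is plain LINQ to Objects, not EF's actual implementation; the FixUp helper, the simplified entity classes, and the sample data are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Purchase
{
    public int Id { get; set; }
    public decimal TotalCost { get; set; }
    public int ConsumerId { get; set; }
}

public class Consumer
{
    public int Id { get; set; }
    public IList<Purchase> Purchases { get; set; } = new List<Purchase>();
}

public static class FixUpSketch
{
    // Hypothetical helper: imitates the effect of relationship fix-up in memory.
    public static void FixUp(IEnumerable<Consumer> consumers, IEnumerable<Purchase> purchases)
    {
        // Index the already-loaded consumers by primary key.
        var byId = consumers.ToDictionary(c => c.Id);

        foreach (var purchase in purchases)
        {
            // Attach each purchase to its parent, if that parent was loaded.
            if (byId.TryGetValue(purchase.ConsumerId, out var consumer))
                consumer.Purchases.Add(purchase);
        }
    }

    public static void Main()
    {
        var consumers = new List<Consumer> { new Consumer { Id = 1 }, new Consumer { Id = 2 } };
        var purchases = new List<Purchase>
        {
            new Purchase { Id = 10, ConsumerId = 1, TotalCost = 5m },
            new Purchase { Id = 11, ConsumerId = 1, TotalCost = 7m },
            new Purchase { Id = 12, ConsumerId = 2, TotalCost = 3m },
        };

        FixUp(consumers, purchases);

        // Same aggregate as in the answer, now computed without any extra lookups.
        var sum = consumers.Sum(c => c.Purchases.Sum(p => p.TotalCost));
        Console.WriteLine(sum);
    }
}
```

The point of the sketch is only that one pass over the purchases is enough to fill every collection, which is why the approach needs 2 queries rather than N+1.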


Option 2: 选项2:

If there is some reason you want to fetch the two lists from different contexts and then link them, I would add another property to the Consumer class:

partial class Consumer
{
  public List<Purchase> UI_Purchases { get; set; }
}

You can then set this property from the Purchases collection and use it in your UI.
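
A minimal sketch of setting that property, assuming both lists are already in memory. The Attach helper, the trimmed-down entity classes, and the sample data are hypothetical and stand in for the generated EF classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Purchase
{
    public int Id { get; set; }
    public decimal TotalCost { get; set; }
    public int ConsumerId { get; set; }
}

// Stand-in for the generated half of the entity class.
public partial class Consumer
{
    public int Id { get; set; }
}

public partial class Consumer
{
    // Non-virtual, so reading it cannot trigger a lazy load.
    public List<Purchase> UI_Purchases { get; set; }
}

public static class Option2Sketch
{
    // Hypothetical helper: links two independently loaded lists.
    public static void Attach(List<Consumer> consumers, List<Purchase> purchases)
    {
        // Group the purchases once; a missing key yields an empty sequence.
        var byConsumer = purchases.ToLookup(p => p.ConsumerId);

        foreach (var consumer in consumers)
            consumer.UI_Purchases = byConsumer[consumer.Id].ToList();
    }

    public static void Main()
    {
        var consumers = new List<Consumer> { new Consumer { Id = 1 }, new Consumer { Id = 2 } };
        var purchases = new List<Purchase>
        {
            new Purchase { Id = 10, ConsumerId = 1, TotalCost = 5m },
            new Purchase { Id = 11, ConsumerId = 1, TotalCost = 7m },
        };

        Attach(consumers, purchases);

        foreach (var c in consumers)
            Console.WriteLine(c.Id + ": " + c.UI_Purchases.Count);
    }
}
```

Using ToLookup rather than a Where call per consumer keeps the linking step to a single pass over the purchases.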

Grab my consumers

var consumers = _entityContext.Consumers
                              .Where(consumer => consumer.Id > 1000)
                              .ToList();

Grab my purchases

var purchases = consumers.Select(x => new {
                                       Id = x.Id,
                                       Purchases = x.Purchases
                                       })
                         .ToList()
                         .GroupBy(x => x.Id)
                         // merge the purchases of each group into a single list
                         .Select(g => g.SelectMany(x => x.Purchases).ToList())
                         .ToList();

Hydrate the consumer.Purchases collections manually from the purchases I already loaded into memory.

for (int i = 0; i < consumers.Count; i++)
   consumers[i].Purchases = purchases[i];

Would it not be possible for you to work around the many-roundtrips-or-inefficient-query-generation problem by doing the work on the database, essentially by returning a projection instead of a particular entity, as demonstrated below:

var query = from c in db.Consumers
            where c.Id > 1000
            select new { Consumer = c, Total = c.Purchases.Sum( p => p.TotalCost ) };
var total = query.Sum( cp => cp.Total );

I'm not an EF expert by any means, so forgive me if this technique is not appropriate.
