
Removing select N+1 without .Include

Consider these contrived entity objects:

public class Consumer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool NeedsProcessed { get; set; }
    public virtual IList<Purchase> Purchases { get; set; }  //virtual so EF can lazy-load
}

public class Purchase
{
    public int Id { get; set; }
    public decimal TotalCost { get; set; }
    public int ConsumerId { get; set; }
}

Now let's say I want to run this code:

var consumers = Consumers.Where(consumer => consumer.NeedsProcessed);

//assume that ProcessConsumers accesses the Consumer.Purchases property
SomeExternalServiceICannotModify.ProcessConsumers(consumers);

By default this will suffer from Select N+1 inside the ProcessConsumers method. It will trigger a query when it enumerates the consumers, then it'll grab each purchases collection 1 by 1. The standard solution to this problem would be to add an include:

var consumers = Consumers.Include("Purchases").Where(consumer => consumer.NeedsProcessed);

//assume that ProcessConsumers accesses the Consumer.Purchases property
SomeExternalServiceICannotModify.ProcessConsumers(consumers);

That works fine in many cases, but in some complex cases an include can degrade performance by orders of magnitude. Is it possible to do something like this instead:

  1. Grab my consumers, var consumers = _entityContext.Consumers.Where(...).ToList()
  2. Grab my purchases, var purchases = _entityContext.Purchases.Where(...).ToList()
  3. Hydrate the consumer.Purchases collections manually from the purchases I already loaded into memory. Then when I pass it to ProcessConsumers it won't trigger more db queries.

I'm not sure how to do #3. If you try to access any consumer.Purchases collection that'll trigger the lazy load (and thus the Select N+1). Perhaps I need to cast the Consumers to the proper type (instead of the EF proxy type) and then load the collection? Something like this:

foreach (var consumer in consumers)
{
     //since the EF proxy overrides the Purchases property, this doesn't really work, I'm trying to figure out what would
     ((Consumer)consumer).Purchases = purchases.Where(x => x.ConsumerId == consumer.Id).ToList();
}

EDIT: I have re-written the example a bit to hopefully reveal the issue more clearly.

If I'm understanding correctly, you would like to load a filtered subset of Consumers, each with a filtered subset of its Purchases, in 1 query. If that's not correct, please forgive my misunderstanding of your intent. If that is correct, you could do something like:

var consumersAndPurchases = db.Consumers.Where(...)
    .Select(c => new {
        Consumer = c,
        RelevantPurchases = c.Purchases.Where(...)
    })
    .AsNoTracking()
    .ToList(); // loads in 1 query

// this should be OK because we did AsNoTracking()
consumersAndPurchases.ForEach(t => t.Consumer.Purchases = t.RelevantPurchases.ToList()); // ToList() because Purchases is an IList<Purchase>

CannotModify.Process(consumersAndPurchases.Select(t => t.Consumer));

Note that this WON'T work if the Process function is expecting to modify the consumer object and then commit those changes back to the database.

EF will populate the consumer.Purchases collections for you, if you use the same context to fetch both collections:

List<Consumer> consumers = null;
using ( var ctx = new XXXEntities() )
{
  consumers = ctx.Consumers.Where( ... ).ToList();

  // EF will populate consumers.Purchases when it loads these objects
  ctx.Purchases.Where( ... ).ToList();
}

// the Purchase objects are now in the consumer.Purchases collections
var sum = consumers.Sum( c => c.Purchases.Sum( p => p.TotalCost ) );

EDIT:

This results in just 2 db calls: 1 to get the collection of Consumers and 1 to get the collection of Purchases.

EF will look at each Purchase record returned and look up the corresponding Consumer record from Purchase.ConsumerId. It will then add the Purchase object to the Consumer.Purchases collection for you.


Option 2:

If there is some reason you want to fetch two lists from different contexts and then link them, I would add another property to the Consumer class:

partial class Consumer
{
  public List<Purchase> UI_Purchases { get; set; }
}

You can then set this property from the Purchases collection and use it in your UI.
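For what it's worth, the linking step could look roughly like the sketch below. It is only an illustration under a few assumptions: the Consumer and Purchase classes here are simplified stand-ins rather than the generated EF entities, LinkPurchases is a hypothetical helper name, and the two in-memory lists stand in for whatever each context returned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the entities (assumption: not the generated EF classes).
public class Purchase
{
    public int Id { get; set; }
    public decimal TotalCost { get; set; }
    public int ConsumerId { get; set; }
}

public class Consumer
{
    public int Id { get; set; }
    public string Name { get; set; }

    // Plain, non-virtual property, so there is nothing for a proxy to override.
    public List<Purchase> UI_Purchases { get; set; }
}

public static class Program
{
    // Hypothetical helper: attaches each purchase to its consumer in memory.
    public static void LinkPurchases(IEnumerable<Consumer> consumers,
                                     IEnumerable<Purchase> purchases)
    {
        // Group the purchases once, instead of running a Where() per consumer.
        ILookup<int, Purchase> byConsumer = purchases.ToLookup(p => p.ConsumerId);

        foreach (var consumer in consumers)
        {
            // A lookup yields an empty sequence for a key it does not contain,
            // so consumers without purchases end up with an empty list.
            consumer.UI_Purchases = byConsumer[consumer.Id].ToList();
        }
    }

    public static void Main()
    {
        // In-memory stand-ins for the two lists fetched from different contexts.
        var consumers = new List<Consumer>
        {
            new Consumer { Id = 1, Name = "A" },
            new Consumer { Id = 2, Name = "B" }
        };
        var purchases = new List<Purchase>
        {
            new Purchase { Id = 10, ConsumerId = 1, TotalCost = 5m },
            new Purchase { Id = 11, ConsumerId = 1, TotalCost = 7m }
        };

        LinkPurchases(consumers, purchases);

        foreach (var c in consumers)
        {
            Console.WriteLine("{0}: {1} purchase(s)", c.Name, c.UI_Purchases.Count);
        }
    }
}
```

Building the lookup first keeps the linking to a single pass over each list, which matters once the lists get large.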

Grab my consumers

var consumers = _entityContext.Consumers
                              .Where(consumer => consumer.Id > 1000)
                              .ToList();

Grab my purchases

var purchases = consumers.Select(x => new {
                                       Id = x.Id,
                                       Purchases = x.Purchases
                                       })
                         .ToList()
                         .GroupBy(x => x.Id)
                         // flatten each group into a single list of purchases per consumer
                         .Select(g => g.SelectMany(x => x.Purchases).ToList())
                         .ToList();

Hydrate the consumer.Purchases collections manually from the purchases I already loaded into memory.

for (int i = 0; i < consumers.Count; i++)
   consumers[i].Purchases = purchases[i];

Could you work around the many-roundtrips-or-inefficient-query-generation problem by doing the work in the database, essentially by returning a projection instead of a particular entity, as demonstrated below:

var query = from c in db.Consumers
            where c.Id > 1000
            select new { Consumer = c, Total = c.Purchases.Sum( p => p.TotalCost ) };
var total = query.Sum( cp => cp.Total );

I'm not an EF expert by any means, so forgive me if this technique is not appropriate.


 