LINQ Query optimalisation using EF6

Question

I'm trying my hand at LINQ for the first time and just wanted to post a small question to make sure if this was the best way to go about it. I want a list of every value in a table. So far this is what I have, and it works, but is this the best way to go about collecting everything in a LINQ friendly way?

    public static List<Table1> GetAllDatainTable()
    {
        List<Table1> Alldata = new List<Table1>();

        using (var context = new EFContext())
        {
           Alldata = context.Tablename.ToList();
        }
        return Alldata;
    }

Answer 1

For simple entities, that is an entity that has no references to other entities (navigation properties) your approach is essentially fine. It can be condensed down to:

public static List<Table1> GetAllDatainTable()
{
    using (var context = new EFContext())
    {
       return context.Table1s.ToList();
    }
}

However, in most real-world scenarios you are going to want to leverage things like navigation properties for the relationships between entities. Ie an Order references a Customer with Address details, and contains OrderLines which each reference a Product, etc. Returning entities this way becomes problematic because any code that accepts the entities returned by a method like this should be getting either complete, or completable entities.

For instance if I have a method that returns an order, and I have various code that uses that order information: Some of that code might try to get info about the order's customer, other code might be interested in the products. EF supports lazy loading so that related data can be pulled if, and when needed, however that only works within the lifespan of the DbContext. A method like this disposes the DbContext so Lazy Loading is off the cards.

One option is to eager load everything:

using (var context = new EFContext())
{
    var order = context.Orders
        .Include(o => o.Customer)
            .ThenInclude(c => c.Addresses)
        .Include(o => o.OrderLines)
            .ThenInclude(ol => ol.Product)
        .Single(o => o.OrderId == orderId);
    return order;
}

However, there are two drawbacks to this approach. Firstly, it means loading considerably more data every time we fetch an order. The consuming code may not care about the customer or order lines, but we've loaded it all anyways. Secondly, as systems evolve, new relationships may be introduced that older code won't necessarily be noticed to be updated to include leading to potential NullReferenceException s, bugs, or performance issues when more and more related data gets included. The view or whatever is initially consuming this entity may not expect to reference these new relationships, but once you start passing around entities to views, from views, and to other methods, any code accepting an entity should expect to rely on the fact that the entity is complete or can be made complete. It can be a nightmare to have an Order potentially loaded in various levels of "completeness" and code handling whether data is loaded or not. As a general recommendation, I advise not to pass entities around outside of the scope of the DbContext that loaded them.

The better solution is to leverage projection to populate view models from the entities suited to your code's consumption. WPF often utilizes the MVVM pattern, so this means using EF's Select method or Automapper's ProjectTo method to populate view models based each of your consumer's needs. When your code is working with ViewModels containing the data views and such need, then loading and populating entities as needed this allows you to produce far more efficient (fast) and resilient queries to get data out.

If I have a view that lists orders with a created date, customer name, and list of products /w quantities we define a view model for the view:

[Serializable]
public class OrderSummary
{
    public int OrderId { get; set; }
    public string OrderNumber { get; set; }
    public DateTime CreatedAt { get; set; }
    public string CustomerName { get; set; }
    public ICollection<OrderLineSummary> OrderLines { get; set; } = new List<OrderLineSummary>();
}

[Serializable]
public class OrderLineSummary
{
    public int OrderLineId { get; set; }
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public int Quantity { get; set; }
}

then project the view models in the Linq query:

using (var context = new EFContext())
{
    var orders = context.Orders
        // add filters & such /w Where() / OrderBy() etc.
        .Select(o => new OrderSummary
        {
            OrderId = o.OrderId,
            OrderNumber = o.OrderNumber,
            CreatedAt = o.CreatedAt,
            CustomerName = o.Customer.Name,
            OrderLines = o.OrderLines.Select( ol => new OrderLineSummary
            {
                OrderLineId = ol.OrderLineId,
                ProductId = ol.Product.ProductId,
                ProductName = ol.Product.Name,
                Quantity = ol.Quantity
            }).ToList()
        }).ToList();
    return orders;
}

Note that we don't need to worry about eager loading related entities, and if later down the road an order or customer or such gains new relationships, the above query will continue to work, only being updated if the new relationship information is useful for the view(s) it serves. It can compose a faster, less memory intensive query fetching fewer fields to be passed over the wire from the database to the application, and indexes can be employed to tune this even further for high-use queries.

Update:

Additional performance tips: Generally avoid methods like GetAll*() as a lowest common denominator method. Far too many performance issues I come across with methods like this are in the form of:

var ordersToShip = GetAllOrders()
    .Where(o => o.OrderStatus == OrderStatus.Pending)
    .ToList();
foreach(order in ordersToShip)
{
    // do something that only needs order.OrderId.
}

Where GetAllOrders() returns List<Order> or IEnumerable<Order> . Sometimes there is code like GetAllOrders().Count() > 0 or such.

Code like this is extremely inefficient because GetAllOrders() fetches * all records from the database, only to load them into memory in the application to later be filtered down or counted etc.

If you're following a path to abstract away the EF DbContext and entities into a service / repository through methods then you should ensure that the service exposes methods to produce efficient queries, or forgo the abstraction and leverage the DbContext directly where data is needed.

var orderIdsToShip = context.Orders
    .Where(o => o.OrderStatus == OrderStatus.Pending)
    .Select(o => o.OrderId)
    .ToList();


var customerOrderCount = context.Customer
    .Where(c => c.CustomerId == customerId)
    .Select(c => c.Orders.Count())
    .Single();

EF is extremely powerful and when selected to service your application should be embraced as part of the application to give the maximum benefit. I recommend avoiding coding to abstract it away purely for the sake of abstraction unless you are looking to employ unit testing to isolate the dependency on data with mocks. In this case I recommend leveraging a unit of work wrapper for the DbContext and the Repository pattern leveraging IQueryable to make isolating business logic simple.

LINQ Query optimalisation using EF6

Question

1 answers

solution1
0 ACCPTED 2021-04-19 21:52:08

LINQ Query optimalisation using EF6

Question

1 answers

solution1 0 ACCPTED 2021-04-19 21:52:08

solution1
0 ACCPTED 2021-04-19 21:52:08