
Multiple queries and poor performance in Entity Framework when using Distinct and Order By

The following LINQ query in EF Core (v2.2.1):

    var x = context.Exchange
            .Include(q => q.Input)
            .Where(q => q.InputId != 1 &&
                q.Input.CreatedOnUtc > DateTime.Parse("2019-11-25") &&
                q.Input.UserId == 2 &&
                q.BotConversationId == 3)
            .Distinct()
            .OrderBy(q => q.Input.CreatedOnUtc)
            .FirstOrDefault();

ends up producing the following profiled SQL (simplified):

select * from (
    select distinct e.* 
    from Exchange e, ExchangeInput i 
    where e.InputId = i.InputId
    and e.InputId <> 1
    and i.UserId = 2
    and e.BotConversationId = 3
)

select * from ExchangeInput

Why does it need to run two separate queries? The second query is horrendous when ExchangeInput might have millions of rows. Surely this would suffice:

select * from (
    select distinct e.*, i.CreatedOnUtc 
    from Exchange e, ExchangeInput i 
    where e.InputId = i.InputId
    and e.InputId <> 1
    and i.UserId = 2
    and e.BotConversationId = 3
) a
order by a.CreatedOnUtc 

Also, putting the Distinct() after the OrderBy() produces only one query, as I would expect.

Fixing the problem is easy enough: adding a .Select(...) before the .Distinct(), or removing the .Distinct() altogether, will do it. But the initial, poorly performing code doesn't look obviously problematic in a code review.
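For illustration, the Select-before-Distinct shape could look roughly like the sketch below. It runs against in-memory data (LINQ to Objects) rather than a real DbContext, so it only shows the shape of the query and says nothing about the SQL that EF Core would emit. The tuple fields are hypothetical stand-ins for the Exchange entity flattened with its Input (so CreatedOnUtc here plays the role of q.Input.CreatedOnUtc), and the sample rows are made up.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory rows standing in for Exchange joined to its Input.
var exchanges = new[]
{
    (Id: 10, InputId: 4, BotConversationId: 3, UserId: 2, CreatedOnUtc: new DateTime(2019, 11, 27)),
    (Id: 11, InputId: 5, BotConversationId: 3, UserId: 2, CreatedOnUtc: new DateTime(2019, 11, 26)),
    (Id: 11, InputId: 5, BotConversationId: 3, UserId: 2, CreatedOnUtc: new DateTime(2019, 11, 26)), // duplicate
    (Id: 12, InputId: 1, BotConversationId: 3, UserId: 2, CreatedOnUtc: new DateTime(2019, 11, 26)), // filtered: InputId == 1
    (Id: 13, InputId: 6, BotConversationId: 3, UserId: 2, CreatedOnUtc: new DateTime(2019, 11, 20)), // filtered: too old
};

var cutoff = new DateTime(2019, 11, 25);

var first = exchanges
    .Where(e => e.InputId != 1 &&
                e.CreatedOnUtc > cutoff &&
                e.UserId == 2 &&
                e.BotConversationId == 3)
    // Project to only the columns that are needed before calling Distinct().
    // Anonymous types compare by value, so duplicates collapse here.
    .Select(e => new { e.Id, e.CreatedOnUtc })
    .Distinct()
    .OrderBy(e => e.CreatedOnUtc)
    .FirstOrDefault();

Console.WriteLine(first?.Id);
```

The projection carries the sort key along, so the OrderBy after the Distinct no longer has to reach through a navigation property.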

I would start by suggesting that calling Distinct() before FirstOrDefault() is unnecessary. When you have an OrderBy, the first row of a "non-distinct" query should always be the same as the first row of the "distinct" query! As you mentioned in your last sentence, removing the Distinct() should leave you with a single query.
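A minimal in-memory sketch of that claim, using LINQ to Objects rather than EF Core. The tuple type and the sample rows are hypothetical and only stand in for the joined Exchange/Input data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows: (ExchangeId, CreatedOnUtc). One row is an exact duplicate,
// which is what Distinct() would remove.
var rows = new List<(int ExchangeId, DateTime CreatedOnUtc)>
{
    (7, new DateTime(2019, 11, 27)),
    (5, new DateTime(2019, 11, 26)),
    (5, new DateTime(2019, 11, 26)), // exact duplicate
    (9, new DateTime(2019, 11, 28)),
};

// Earliest row with duplicates removed first.
var withDistinct = rows.Distinct().OrderBy(r => r.CreatedOnUtc).FirstOrDefault();

// Earliest row without removing duplicates.
var withoutDistinct = rows.OrderBy(r => r.CreatedOnUtc).FirstOrDefault();

// Value tuples compare element by element.
Console.WriteLine(withDistinct == withoutDistinct); // prints "True"
```

One caveat: if two different rows share the same CreatedOnUtc, which of them comes first is not determined by the sort key alone, with or without Distinct(), so a tie-breaking second sort key is worth considering in either version.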

Separately from your question, I would also suggest evaluating DateTime.Parse("2019-11-25") outside of the query. That should allow the value to be passed to the database server as a parameter, which might make your query even more efficient.

All in all, I would try:

var dateFilter = DateTime.Parse("2019-11-25");
var x = context.Exchange
        .Include(q => q.Input)
        .Where(q => q.InputId != 1 &&
            q.Input.CreatedOnUtc > dateFilter &&
            q.Input.UserId == 2 &&
            q.BotConversationId == 3)
        .OrderBy(q => q.Input.CreatedOnUtc)
        .FirstOrDefault();


 