简体   繁体   中英

Entity Framework Take statement, cause EF generates unefficient SQL query

I'm using Take and Skip statements in EF CodeFirst, for implementing Paging(as Zoran Maksimovic said in this post ), these statements cause EF generate a sql query like this(my page size is 100):

 SELECT TOP (100) [Filter1].[Id]                AS [Id],
                  [Filter1].[SendDuration]      AS [SendDuration]
        FROM   (SELECT [Extent1].[Id] AS [Id],               
                       [Extent1].[SendDuration] AS [SendDuration],
                       row_number() OVER (ORDER BY [Extent1].[SendDuration] DESC) AS [row_number]
                       FROM   [dbo].[MyView] AS [Extent1]
                       WHERE  (1293>= [Extent1].[Id])
                )AS [Filter1]
                WHERE  [Filter1].[row_number] > 500
                ORDER  BY [Filter1].[SendDuration] DESC

but this sql is so slow, when run in sql server, but as Boanerge said in his comment using row_number < X instead of Top(y) cause performance increase. i mean that if i change the generated sql to:

     SELECT  [Filter1].[Id]                AS [Id],
                  [Filter1].[SendDuration]      AS [SendDuration]
        FROM   (SELECT [Extent1].[Id] AS [Id],               
                       [Extent1].[SendDuration] AS [SendDuration],
                       row_number() OVER (ORDER BY [Extent1].[SendDuration] DESC) AS [row_number]
                       FROM   [dbo].[MyView] AS [Extent1]
                       WHERE  (1293>= [Extent1].[Id])
                )AS [Filter1]
                WHERE  [Filter1].[row_number] > 500 and [Filter1].[row_number] <= 600
                ORDER  BY [Filter1].[SendDuration] DESC

query execution time will be better and more acceptable(in some cases 4 or 5 time more faster). Is there any way to force EF to generate 2nd Sql instead 1st Sql?

Without seeing your full schema etc. it's hard to be certain but to me this indicates a missing index 1 .

Run the query with include actual execution plan and see what SSMS tells you - it often suggest an extra index.

I've got one dataset that is millions of records - and spent about 3 days optimising each query using LINQpad, SSMS and detailed analysis of the execution plan. In the end I managed to save around 96% execution time just by adding 3 extra indexes.


1 It's a clue that indexing is required/would help when you can restrict the range and improve the performance - as that's indicating that a scan is being performed and we don't like scans...

This is the double edge of algorithmic query generation: the query is only as good as the algorithm. The only way for that to change is if the algorithm decides to do something different - perhaps with a different version of the ORM tool. Of course, "different" is not always the same as "better" :)

For non-trivial queries, an experienced SQL developer will - when armed with the right tools (SSMS provides most of what you would need) - often be able to out-perform many generated queries. Of course, a less experienced SQL developer may end up with the query doing the wrong thing .

Most ORMs offer the option to pass a raw SQL query into the engine - this does lose portability, though. If that is a problem, another option is things like stored procedures, where the API could be similar even on different implementations on different databases.

The questions you need to ask here are:

  • do I genuinely target more than one database provider?
  • am I happy to own the ongoing maintenance of this query?

If the answers as "no" and "yes", you should be ok to just pass a raw SQL (presumably T-SQL) query to EF.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM