
TOP clause slowing down query

I have 2 queries: one with a TOP clause and one without. Both return exactly the same result, yet the one with the TOP clause is significantly slower. Why would that be?

Server: Microsoft SQL Server 2008 R2 (SP2)

Query 1 - Regular

insert into #Buffer (details, persistentID, productID, [date])
select de.details, lg.databaseID % 1000, lg.productID, lg.readDateTime
from Log lg with (nolock)
join LogDetails de with (nolock)
    on lg.logID = de.logID
where @startDate <= readDateTime and readDateTime < @endDate

Query 2 - TOP clause

insert into #Buffer (details, persistentID, productID, [date])
select top (@count) de.details, lg.databaseID % 1000, lg.productID, lg.readDateTime
from Log lg with (nolock)
join LogDetails de with (nolock)
    on lg.logID = de.logID
where @startDate <= readDateTime and readDateTime < @endDate

Note @count is the size of the result set.

Oddly, the CPU time of Query 2 is half that of Query 1, and yet the elapsed time of Query 2 is 3 times that of Query 1.

This is based on Paul White's article "Inside the Optimizer: Row Goals In Depth".

When you submit a query to SQL Server, the optimizer assumes you will consume every row the query produces. When you introduce operators such as TOP or EXISTS, however, SQL Server tries to find the first rows as quickly as possible, and this can sometimes lead to a less optimal plan. In your case, it led to a Nested Loops plan.

You may ask why this row goal can't simply be optimized well. Paul White explains it as follows:
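One way to check which join operator each query actually gets is to capture the actual execution plans and timings side by side. The following is a minimal sketch, assuming you run it in the same session where `#Buffer`, `@startDate`, `@endDate` and `@count` are already set up as in the question:

```sql
-- Sketch only: wraps the two queries from the question with diagnostics.
set statistics time on;   -- reports CPU time and elapsed time per statement
set statistics xml on;    -- returns the actual execution plan as XML per statement

-- Run Query 1 (no TOP) here, then Query 2 (TOP (@count)) here.

set statistics xml off;
set statistics time off;
```

In the two plans, compare the physical join operator (Hash Match, Merge Join or Nested Loops) and the estimated versus actual row counts on each operator.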

The challenges involved in producing an optimised query plan for row-limited queries, while retaining good general optimisation performance for full-result queries, are more complex than simply replacing hash or merge join iterators with nested loops. It would be reasonably straightforward to cater for queries with a TOP at the root of a plan, using specific code designed to recognise specific scenarios. However, that approach would miss wider opportunities for plan optimisation in more general cases.

The TOP clause can be specified multiple times, in multiple places in a query declaration: in the outermost scope (as in the example); in a sub-query; or in a common table expression – all of which may be arbitrarily complex. The FAST 'n' query hint can also be used to ask the optimiser to prefer a plan which will produce the first 'n' rows quickly, while not restricting the total number of rows returned overall, as is the case with TOP. As a final example, consider that a logical semi-join (such as a sub-query introduced with EXISTS) shares the overall theme: it should be optimised to find the first matching row quickly.

The SQL Server query optimiser provides a way to meet all these requirements by introducing the concept of a 'row goal', which simply establishes a number of rows to 'aim for' at a particular point in the plan.
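To make the FAST 'n' hint from the quote concrete, here is a small sketch against the tables from the question. The value 100, the `datetime` type and the date literals are assumptions chosen purely for illustration:

```sql
-- Sketch only: the join from the question without TOP.
-- FAST 100 asks for a plan that produces the first 100 rows quickly (a row goal),
-- but every qualifying row is still returned.
declare @startDate datetime = '20120101',   -- assumed type and value
        @endDate   datetime = '20120102';   -- assumed type and value

select de.details, lg.databaseID % 1000 as persistentID, lg.productID, lg.readDateTime
from Log lg
join LogDetails de
    on lg.logID = de.logID
where @startDate <= lg.readDateTime and lg.readDateTime < @endDate
option (fast 100);
```

Because the hint sets a row goal without limiting the result, comparing this plan with the plan of the unhinted query shows the effect of the row goal on its own.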

So in your case, to work around this row goal limitation, you can rewrite the query with a join hint, as below:

insert into #Buffer (details, persistentID, productID, [date])
select top (@count) de.details, lg.databaseID % 1000, lg.productID, lg.readDateTime
from Log lg with (nolock)
inner hash join LogDetails de with (nolock)  -- a join hint needs the join type (INNER) spelled out
    on lg.logID = de.logID
where @startDate <= lg.readDateTime and lg.readDateTime < @endDate
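A join hint written in the FROM clause also forces the join order, which may or may not be what you want. Two alternatives are sketched below, assuming the same temp table and variables as in the question. Option B relies on trace flag 4138, which disables row goal adjustments; whether it is available depends on your exact build, and `QUERYTRACEON` normally requires sysadmin permission, so treat it as something to verify rather than a given.

```sql
-- Option A: query-level hint. All joins in the statement use hash joins,
-- but the optimizer stays free to choose the join order.
insert into #Buffer (details, persistentID, productID, [date])
select top (@count) de.details, lg.databaseID % 1000, lg.productID, lg.readDateTime
from Log lg with (nolock)
join LogDetails de with (nolock)
    on lg.logID = de.logID
where @startDate <= lg.readDateTime and lg.readDateTime < @endDate
option (hash join);

-- Option B (assumes trace flag 4138 is supported on your build):
-- plan this statement without row goal adjustments.
insert into #Buffer (details, persistentID, productID, [date])
select top (@count) de.details, lg.databaseID % 1000, lg.productID, lg.readDateTime
from Log lg with (nolock)
join LogDetails de with (nolock)
    on lg.logID = de.logID
where @startDate <= lg.readDateTime and lg.readDateTime < @endDate
option (querytraceon 4138);
```

Option A keeps the hash join from the answer while leaving more freedom to the optimizer. Option B removes the row goal itself, so the plan should resemble that of the query without TOP.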

Below are related threads on Stack Exchange on the same topic:

https://dba.stackexchange.com/questions/157353/wrapping-query-in-if-exists-makes-it-very-slow

https://dba.stackexchange.com/questions/126235/if-exists-taking-longer-than-embedded-select-statement

https://dba.stackexchange.com/questions/24832/how-and-why-does-top-impact-an-execution-plan
