
SQL query that selects n rows, orders them, and then returns m rows

I need to write a SQL query that selects, say, 1000 rows, orders them by one column, and then returns only 100 rows.

Why? My query can match ~1,000,000 rows (or more). I want to take the first 1000 and, from those 1000 rows, show only the 100 with the best relevance. I'm worried about the performance of such a select, so I want to introduce this first step (taking only 1000 rows). I know that I might miss documents with better relevance, but in this case it doesn't matter.

Does it matter if you don't select the first 1000? I.e., if you just use...

Select top 100 *
From table
Order by column

You get the same result and, as pointed out elsewhere, you are more likely to degrade performance than improve it.

If you want to optimize this query, ensure that there is an index on the column; SQL Server will then be able to optimise the retrieval and sorting of the records to give you just what you want.
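To see whether an index actually changes anything, you can compare the query plan before and after creating it. Here is a rough sketch in Python with SQLite, which uses LIMIT where SQL Server uses TOP. The docs table, the relevance column, and the index name are made up for illustration, and SQL Server's plans will look different, so treat this as a way to experiment rather than as proof about your server.

```python
import random
import sqlite3

# Hypothetical schema: a "docs" table with a unique "relevance" score per row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, relevance INTEGER)")
scores = random.Random(0).sample(range(1_000_000), 10_000)
conn.executemany("INSERT INTO docs (relevance) VALUES (?)", [(s,) for s in scores])

# SQLite spelling of: SELECT TOP 100 ... ORDER BY relevance DESC
QUERY = "SELECT id, relevance FROM docs ORDER BY relevance DESC LIMIT 100"


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan(QUERY)
rows_before = conn.execute(QUERY).fetchall()

conn.execute("CREATE INDEX idx_docs_relevance ON docs (relevance)")

plan_after = plan(QUERY)
rows_after = conn.execute(QUERY).fetchall()

print("without index:", plan_before)
print("with index:   ", plan_after)
print("same rows:    ", rows_before == rows_after)
```

The exact plan text depends on the SQLite version, so read what it prints rather than relying on specific wording. The rows returned are the same either way; only the work needed to produce them changes.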

I think I finally understand what you are trying to get at, but it appears you are very confused about how databases perform ordering operations.

If I understand you correctly, you are concerned about the performance impact of sorting a large number of rows (1,000 in your example, though that is NOT a large number of rows). So you are trying to outsmart it by only making it sort the 100 rows you are interested in.

If you apply a where clause to limit it to 100 rows, in most cases a modern DB system will automatically hold off on performing the sort until after it has narrowed down the results to avoid doing extra work. This is not true 100% of the time, but when the DB optimizer decides to sort first, it usually has a VERY good reason based on performance or because the query has identified a condition whereby the sort must be performed first to get accurate results.

The trick is that you have to understand that T-SQL is a declarative language, not a procedural one. That is, you use the language to describe what you want, and the optimizer figures out the exact algorithm to make that happen. It appears that you are trying to optimize your code the way you would if you were writing in a procedural language like C# or Java. The database engine translates your query into an execution plan; it doesn't run it step by step as you typed it.

Long story short, the DB engines are extremely good at this type of simple optimization (and some very complex ones). You aren't going to out-optimize the optimizer with gimmicks like this, so don't even bother. You aren't going to get more performance and depending on how you write the query, you could actually degrade it.

The literal interpretation would lead to

select top 1000 * from tbl order by columnname

And the next step to

SELECT TOP 100 * FROM (select top 1000 * from tbl order by columnname) SQ ORDER BY columnname

But that gives nothing different from a direct

select top 100 * from tbl order by columnname
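If you want to convince yourself that the nested form and the direct form return the same rows, here is a small check you could run. It is only a sketch in Python with SQLite, so TOP n becomes LIMIT n, and the test data is random numbers I made up; tbl and columnname are the placeholder names from above.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, columnname INTEGER)")
# Unique values, so the ordering has no ties and the result is deterministic.
values = random.Random(1).sample(range(1_000_000), 5_000)
conn.executemany("INSERT INTO tbl (columnname) VALUES (?)", [(v,) for v in values])

# Top 100 of the top 1000, both ordered by the same column.
nested = conn.execute(
    """
    SELECT * FROM (SELECT * FROM tbl ORDER BY columnname LIMIT 1000) AS sq
    ORDER BY columnname LIMIT 100
    """
).fetchall()

# Top 100 directly.
direct = conn.execute("SELECT * FROM tbl ORDER BY columnname LIMIT 100").fetchall()

print("identical results:", nested == direct)
```

With the same ordering column at both levels, the inner top 1000 always contains the overall top 100, so the extra step adds work without changing the answer.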

Unless you are after 2 different orderings

SELECT TOP 100 *
FROM (
   select top 1000 * from tbl
   order by columnname) SQ
ORDER BY othercolumn

or switching between asc/desc

SELECT TOP 100 *
FROM (
   select top 1000 * from tbl
   order by columnname ASC) SQ
ORDER BY columnname DESC
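To show that the asc/desc variant really does return something different, here is a sketch of the same idea in Python with SQLite (LIMIT in place of TOP, random made-up data, same placeholder names). The inner query keeps the 1000 smallest values; the outer query then takes the 100 largest of those, i.e. ranks 1000 down to 901.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, columnname INTEGER)")
values = random.Random(2).sample(range(1_000_000), 5_000)  # unique, no ties
conn.executemany("INSERT INTO tbl (columnname) VALUES (?)", [(v,) for v in values])

# Inner: 1000 smallest values. Outer: the 100 largest among those 1000.
switched = conn.execute(
    """
    SELECT columnname
    FROM (SELECT * FROM tbl ORDER BY columnname ASC LIMIT 1000) AS sq
    ORDER BY columnname DESC LIMIT 100
    """
).fetchall()

# For comparison: the 100 largest values in the whole table.
direct_desc = conn.execute(
    "SELECT columnname FROM tbl ORDER BY columnname DESC LIMIT 100"
).fetchall()

print("same as a direct DESC query:", switched == direct_desc)
```

Here the subquery is doing real work, because the two levels disagree about which end of the ordering they want.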

You could use a subquery. Something like:

select top 100 * from (
    select top 1000 * from tablename
) sq
order by fieldname

My SQL is a bit rusty so the syntax might be off a bit, and there may be a better way to do it depending on the platform you're working with, but hopefully this helps.
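For what it's worth, here is a sketch of that "cap first, then sort" idea in Python with SQLite, where both levels use LIMIT. The data is random and made up, and I am assuming higher fieldname values mean better relevance. Without an ORDER BY in the inner query, which 1000 rows you get is up to the engine, so the result can miss rows that would rank higher overall (which the question says is acceptable).

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, fieldname INTEGER)")
scores = random.Random(3).sample(range(1_000_000), 5_000)  # unique scores
conn.executemany("INSERT INTO tablename (fieldname) VALUES (?)", [(s,) for s in scores])

# Take some 1000 rows (no inner ORDER BY), then keep the best 100 of those.
capped = conn.execute(
    """
    SELECT * FROM (SELECT * FROM tablename LIMIT 1000) AS sq
    ORDER BY fieldname DESC LIMIT 100
    """
).fetchall()

# The true best 100 across the whole table, for comparison.
best_overall = conn.execute(
    "SELECT * FROM tablename ORDER BY fieldname DESC LIMIT 100"
).fetchall()

missed = set(best_overall) - set(capped)
print("rows in the true top 100 that the capped query missed:", len(missed))
```

How many rows are missed depends entirely on which 1000 rows the engine happens to hand back first.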


 