简体   繁体   中英

SQL best practice to deal with default sort order

A lot of SQL code I've read, it seems like the developer assumes that the default sort order always holds. For example when building an HTML select list they would just SELECT id, name FROM table without issuing an ORDER BY clause.

From my own experience it seems like dbms alway orders data using FIFO if no ORDER BY clause is given and no index. However, the order is not guaranteed. But I have never seen a dbms reordering data if there no change to the table.

Have you ever experienced a dbms selecting data in a non deterministic order if there is no change to the table?

Is it best practice to always put an ORDER BY clause?

There is no default sort order. Even if the table has a clustered index, you are not guaranteed to get the results in that order. You must use an order by clause if you want a specific order.

As the other posters mention, if you don't specify a sort order, the SQL standard says the results can be in whatever order the query processor finds most expedient and efficient.

Let's say you do a simple unordered SELECT for all the rows of a CUSTOMER table, which has no indexes and no primary key. It's quite possible, and even likely, that the query processor will do a straight table scan and produce the rows in the order they were originally inserted (giving you the FIFO behavior you saw).

If you then add an index on the STATE and CITY fields (in that order), and then query for WHERE STATE = 'NY' the query processor may decide it's more efficient to scan the index entries for STATE = 'NY' rather than to do a full table scan. In this case it would probably materialize the rows in STATE, CITY order.

Even this is not certain. For example if the query processor has gathered statistics that show that nearly all the STATE values in your table are 'NY' (maybe because the database is for an Albany-based equipment rental business), it may decide that the table scan is actually cheaper than the index scan, and you'll see FIFO again.

It's a good idea to learn some basics about how your database plans its queries. You can use the EXPLAIN statement to see how your DBMS would execute any given query, and then use this to optimize your query, in some cases by orders of magnitude. This is a fascinating and useful area to learn.

如果您希望数据按顺序排列,是的 - 您必须使用ORDER BY

Yes. There is no "default order" without an ORDER BY, and there's no guarantee that you'll get the data back in FIFO/LIFO or any other order.

As far as the developers using "SELECT id, name FROM table", they're either inept or they don't care what order anything appears in.

No serious RDBMS guarantees any order unless you specify an explicit ORDER BY.

Anything else is just pure luck or anectodal - if you want order, you have to specify ORDER BY - no way around that.

If you want the data ordered, the only way to guarantee anything (with every major RDBMS system that I'm aware of, definitely Sql Server and Oracle) is to include an ORDER BY clause. FIFO has absolutely nothing to do with the order data is returned without an ORDER BY clause, and there isn't a concept of any kind of DEFAULT sort order. The so called DEFAULT sort order is basically however the engine gets the data, which could be in literally any order based on indexes, cached data, simultaneous executing queries, load on the server, etc., etc.

This other stackoverflow thread is basically covering the same concept in relation to Sql Server, AlexK blogged a repo to demonstrate the behavior.

Even a simple query like SELECT ... FROM table can return data in various order. I know this to be true in theory, I know this to be true in practice, and I have seen plenty of cases when the order changes between subsequent executions, even when no data change occurs in the table.

A typical example of order changes between executions is when the query is executed using a parallel plan. Since parallel operators return data as the underlying threads produce it, the order of the rows in the result varies between each run. This situation makes even the simple SELECT in your example return wildly different results each time is run.

In my experience with SQL, most of the time I do not specify a ORDER BY in SQL, because the record sets are displayed in a " client-side " grid type control etc. where dynamic sorting is supported - in this case ordering by SQL is needless as it will be checked client side anyway.

This is also done client side because the same query might be used to display the data in different places in different orders.

Therefore it is only best practice to put in an ORDER BY, when

  • The order of the data IS important; and
  • The sorting is more efficient at the DB level.

ie if the front end developer is going to be "re-sorting" it anyway, then there is no point, as it unlikely to save overall processing time.

Perhaps the writers of those SQL queries you're reading don't care about the order of the data returned. The best practice is to use it where you need to ensure the order of the results returned!

I'm writing this in case if someone would like to use this as I did.

Well, I'm getting satisfactory default sort order, let's say for log tables, with sort on Index. For example I'm usually interested in last rows of log table (LIFO) so I make DateTime DESC as order. I also tried for fun to add Index on the other field (integer) beside Primary key and it worked.

CREATE TABLE [dbo].[tableA]([DateTime] [datetime] NOT NULL,
CONSTRAINT [PK_tableA] 
PRIMARY KEY CLUSTERED ([DateTime] DESC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY]

Or in SSMS ...

在此处输入图片说明

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM