
How to avoid a 'where' clause affecting row ordering?

I have a case where I do a select from another select and the order of the returned rows is changed if I add a where clause.

Example:

SELECT t.id
FROM (
       SELECT t.id
       FROM table1 t
       ORDER BY
         t.viewsTotal ASC
       LIMIT 20
       OFFSET 0
     ) base
  INNER JOIN table1 t ON base.id = t.id
  LEFT JOIN table2 t2 ON t2.id = t.secondTableId
# WHERE t2.someBoolColumn = FALSE
;

As written, the inner select and the outer select return rows in the same order, but if I uncomment the WHERE condition, the outer select returns them in a different order.

How can I prevent this from happening?

Let's assume the following for this example:

  1. I cannot do this in a single select.
  2. When writing the outer select, I do not know what ordering was applied in the inner select. So, if the inner select orders by a column from a joined table, I wouldn't know that I need to join that table here.

More info on my use case

A query builder provides the inner select, and it may order by a column from a third table that is joined inside that inner select. To apply the same ordering in the outer select, I would need to know which tables were joined, and with this poor query builder I do not have that knowledge.

tl;dr If you want a particular order in your result set, use ORDER BY.

The ordering of rows in a result set from any RDBMS server without an ORDER BY clause is formally unpredictable. Unpredictable is like random, except worse. Random ordering implies you'll get your rows in a different order every time you run the query. Truly random ordering, if it existed, would at least make simple unit tests fail right away whenever they relied on a wrong assumption about ordering.

Unpredictable means you'll get them in the same order, until you don't. That means your unit tests will pass, and your system tests will pass, and your system will fail six months into production, if it depends on result set ordering.

Why is this so? A server's query planner is free to use any algorithm at its disposal to satisfy the queries you give it. These algorithms work differently for different types and sizes of table. If you don't constrain the query planner by specifying the result set ordering, it may pick an algorithm that produces an ordering that appears strange to you, the programmer. Adding a WHERE clause is exactly the kind of change that can make the planner choose a different algorithm.

Query planners have, literally, thousands of programmer-years' worth of optimizations built into them.

For people used to the procedural ways of thinking encouraged by all kinds of programming languages, it's sometimes hard to switch your thinking to the declarative / descriptive mode used by SQL. With SQL (at least clean SQL without stuff like SELECT @a := @a+1 and other hacks) you're simply describing the result set you want. The server generates results matching your specification.
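As a minimal sketch of describing the ordering as part of the result set, the query from the question could carry its own ORDER BY on the outer select. This assumes the outer query knows the sort key (viewsTotal here) and can reach it through a table it already joins; the id tie-breaker is an addition for illustration, not something the question specifies.

```sql
-- Sketch only: assumes the outer query knows the inner sort key (viewsTotal).
SELECT t.id
FROM (
       SELECT t.id
       FROM table1 t
       ORDER BY t.viewsTotal ASC
       LIMIT 20
       OFFSET 0
     ) base
  INNER JOIN table1 t ON base.id = t.id
  LEFT JOIN table2 t2 ON t2.id = t.secondTableId
WHERE t2.someBoolColumn = FALSE
-- The inner ORDER BY only decides which 20 rows are picked;
-- this outer ORDER BY is what fixes the order of the final result.
-- The id tie-breaker (an assumption) keeps rows with equal viewsTotal stable.
ORDER BY t.viewsTotal ASC, t.id ASC;
```

With the outer ORDER BY in place, the WHERE clause can be added or removed without changing the relative order of the rows that remain.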

I would suggest you not rely on the implicit ordering produced by SQL (because there is no implicit ordering, as per Bohemian's comment). Rather, you should add an ORDER BY clause and pick one of the columns available in the query to order your results by. That way you can ensure that the results are always presented in the same order regardless of the WHERE clause.
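When the outer query cannot know what the inner select sorted by, one possible workaround is to have the inner select expose its position as a column and order the outer select by that. The sketch below assumes a server with window functions (for example MySQL 8.0 or later) and assumes the query builder can be made to add such a column; the name pos is hypothetical, and viewsTotal stands in for whatever ordering the builder actually applies.

```sql
-- Sketch only: "pos" is a hypothetical column the inner select would have to emit.
-- Requires window function support (e.g. MySQL 8.0+).
SELECT t.id
FROM (
       SELECT t.id,
              -- Number the rows in the inner ordering, whatever it happens to be.
              ROW_NUMBER() OVER (ORDER BY t.viewsTotal ASC, t.id ASC) AS pos
       FROM table1 t
       ORDER BY pos
       LIMIT 20
       OFFSET 0
     ) base
  INNER JOIN table1 t ON base.id = t.id
  LEFT JOIN table2 t2 ON t2.id = t.secondTableId
WHERE t2.someBoolColumn = FALSE
-- The outer query only needs to know about pos, not about the inner sort key
-- or any tables that were joined to compute it.
ORDER BY base.pos;
```

The outer select then depends only on base.pos, so it keeps the inner ordering even when the sort key comes from a table it never joins.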


 