
Does the order of conditions make a performance difference in MySQL?

Suppose I have a MySQL query like the one below, where the table PEOPLE has about 2 million rows:

SELECT * FROM `PEOPLE` WHERE `SEX`=1 AND `AGE`=28;

The first condition alone returns about 1 million rows, and the second may return about 20,000 rows. On a local website, most developers said that changing the order of the conditions would perform better. They also said that it would cost 2 million + 1 million + *10,000* in I/O time if the order is changed, while the original query above would cost 2 million + 20,000 + *10,000* in I/O time. That sounds plausible.

As we all know, MySQL has an internal query optimizer for this kind of work. Does the order of the conditions need particular attention for optimal performance? I am totally confused.

PS: I noticed that some similar questions have already been asked, but they are two or three years old, so it seems better to ask again.


Thanks to everyone who noticed this question. Here is an explanation of why I am asking again:

Before asking this question, I ran EXPLAIN a couple of times, and the result was that the order doesn't matter. But an interviewer told me that the order makes a performance difference, so I want to make sure I am not missing something.

You should first understand one fundamental thing: in theory, a relational database does not have indices.

A purely theoretical relational database engine would indeed scan all records, check the criteria on the sex and age columns, and return only the relevant rows.

However, indices are a common layer added by SQL database engines to filter rows faster. In this case, you should have indices for both of these columns.
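As a minimal sketch, the two single-column indices could be created as follows. The table and column names come from the question; the index names `idx_people_sex` and `idx_people_age` are hypothetical and chosen only for illustration.

```sql
-- Hypothetical index names; the table and columns are those from the question.
CREATE INDEX idx_people_sex ON `PEOPLE` (`SEX`);
CREATE INDEX idx_people_age ON `PEOPLE` (`AGE`);
```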

What is more, these database engines analyze the indices (if any) to determine the best course of action for retrieving the relevant rows. In particular, one piece of index metadata is cardinality, an estimate of the number of distinct values in the indexed column. Put another way: for a given value of the indexed column, how many rows match on average? The higher the number of rows, the lower the cardinality. Therefore, the higher the cardinality, the more selective the index, and the better.
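To see the statistics the optimizer works from, something like the following could be run against the table from the question. This is only a sketch: the `Cardinality` column reported by `SHOW INDEX` is an estimate, and the final query is an optional exact count that scans the whole table.

```sql
-- Refresh the index statistics so the estimates are current.
ANALYZE TABLE `PEOPLE`;

-- List the table's indices; the Cardinality column is the estimated
-- number of distinct values in each index.
SHOW INDEX FROM `PEOPLE`;

-- Exact distinct counts for comparison (a full scan, slow on a large table).
SELECT COUNT(DISTINCT `SEX`) AS sex_values,
       COUNT(DISTINCT `AGE`) AS age_values,
       COUNT(*)              AS total_rows
FROM `PEOPLE`;
```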

Therefore, an SQL engine's query optimizer will most likely narrow the result set by looking up the age index first, whatever order the conditions are written in, and only then consider the index on sex. It may even choose not to use the index on sex at all if it determines that checking the sex column value for each row left by the first filter is faster. That is likely here, since the cardinality of the sex column is extremely low.
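One way to check this on your own data is to run EXPLAIN on both orderings and compare the plans, as the asker already did. The sketch below uses the query from the question. If the `key` and `rows` columns are the same in both outputs, the optimizer chose the same plan regardless of the written order.

```sql
-- Original order of conditions.
EXPLAIN SELECT * FROM `PEOPLE` WHERE `SEX`=1 AND `AGE`=28;

-- Swapped order; compare the possible_keys, key and rows columns
-- with the output of the statement above.
EXPLAIN SELECT * FROM `PEOPLE` WHERE `AGE`=28 AND `SEX`=1;
```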

Have a look here for an introduction to the relational model.
