
Why does the query take a long time in mysql even with a LIMIT clause?

Say I have an Order table that has 100+ columns and 1 million rows. It has a PK on OrderID and FK constraint StoreID --> Store.StoreID.

1) select * from `Order` order by OrderID desc limit 10;

the above takes a few milliseconds.

2) select * from `Order` o join `Store` s on s.StoreID = o.StoreID order by OrderID desc limit 10;

This can somehow take many seconds, and every additional inner join slows it down further.

3) select OrderID, column1 from `Order` o join `Store` s on s.StoreID = o.StoreID order by OrderID desc limit 10;

this seems to speed the execution up, by limiting the columns we select.

There are a few points that I don't understand here, and I would really appreciate it if anyone more knowledgeable about MySQL (or RDBMS query execution in general) could enlighten me.

Query 1 is fast since it's just a reverse lookup by PK and DB only needs to return the first 10 rows it encountered.

I don't see why Query 2 should take forever. Shouldn't the operation be the same, i.e. get the first 10 rows by PK and then join with the other tables? Since there's an FK constraint, the relationship is guaranteed to be satisfied, so the DB doesn't need to join more rows than necessary and then trim the result, right? Unless the FK constraint allows a null FK? In that case I guess a left join would be much faster than an inner join?

Lastly, I'm guessing Query 3 is simply faster because fewer columns are used in those unnecessary joins? But why would query execution need the other columns while joining? Shouldn't it just join using the PKs first, and then get the columns for just the 10 rows?

Thanks!

My understanding is that the MySQL engine applies LIMIT after any joins happen.

From http://dev.mysql.com/doc/refman/5.0/en/select.html: "The HAVING clause is applied nearly last, just before items are sent to the client, with no optimization. (LIMIT is applied after HAVING.)"

EDIT: You could try using this query to take advantage of the PK speed.

select * from (select * from `Order` order by OrderID desc limit 10) o join `Store` s on s.StoreID = o.StoreID;
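One detail in the rewrite above may be worth guarding against: the outer query has no ORDER BY, so the order of the final rows is not guaranteed once the derived table has been joined. A minimal sketch that re-applies the ordering to the 10 rows, assuming the same `Order` and `Store` tables as in the question (the aliases are only illustrative):

```sql
-- Pick the 10 newest orders first (a backward walk of the primary key),
-- then join only those 10 rows to Store.
SELECT o.*, s.*
FROM (
    SELECT *
    FROM `Order`
    ORDER BY OrderID DESC
    LIMIT 10
) AS o
JOIN `Store` AS s ON s.StoreID = o.StoreID
-- Sorting 10 rows is cheap and makes the output order explicit.
ORDER BY o.OrderID DESC;
```

This returns the same rows as query 2 only if every one of those 10 orders has a matching Store row, i.e. StoreID is NOT NULL and covered by the FK. If StoreID can be NULL, the inner join may return fewer than 10 rows.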

All of your examples are asking for table scans of the existing tables, so none of them will be more or less performant than the degree to which MySQL can cache the data or results. Some of your queries have ORDER BY or join criteria, which can take advantage of indexes purely to make the joining process more efficient; however, that still is not the same as having a set of criteria that will trigger the use of indexes.

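LIMIT is not a criterion -- it can be thought of as a filter applied once a result set is determined. Unless the optimizer can read the rows in index order and stop early (as it does in query 1), you save time on the client, once the result set is prepared, but not on the server.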

Really, the only way to get the answers you are seeking is to become familiar with: EXPLAIN EXTENDED your_sql_statement

The output of EXPLAIN will show you how many rows are being looked at by mysql, as well as whether or not any indexes are being used.
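As a sketch of how that might look for the slow query 2 from the question (table names are taken from the question; the plan you get depends on your data, indexes, and MySQL version, so no output is shown here):

```sql
-- Ask MySQL for the execution plan instead of running the query.
-- Plain EXPLAIN is used here: the EXTENDED keyword was deprecated in
-- MySQL 5.7 and removed in 8.0, where its extra output is the default.
EXPLAIN
SELECT *
FROM `Order` AS o
JOIN `Store` AS s ON s.StoreID = o.StoreID
ORDER BY o.OrderID DESC
LIMIT 10;
```

The plan rows are listed in join order. If `Store` comes first, or the Extra column shows "Using temporary; Using filesort", MySQL is probably joining and sorting a large intermediate result before applying LIMIT, rather than walking the `Order` primary key backwards and stopping after 10 rows. The rows column gives the estimated number of rows examined per table, and key shows which index, if any, was chosen.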
