Why does the query take a long time in mysql even with a LIMIT clause?

Say I have an Order table that has 100+ columns and 1 million rows. It has a PK on OrderID and an FK constraint StoreID --> Store.StoreID.

1) select * from `Order` order by OrderID desc limit 10;

The above takes a few milliseconds.

2) select * from `Order` o join `Store` s on s.StoreID = o.StoreID order by OrderID desc limit 10;

This can somehow take many seconds. The more inner joins I add, the slower it gets.

3) select OrderID, column1 from `Order` o join `Store` s on s.StoreID = o.StoreID order by OrderID desc limit 10;

Limiting the columns we select seems to speed the execution up.

There are a few points that I don't understand here, and I would really appreciate it if anyone more knowledgeable about MySQL (or RDBMS query execution in general) could enlighten me.

Query 1 is fast since it's just a reverse lookup by PK, and the DB only needs to return the first 10 rows it encounters.

I don't see why Query 2 should take forever. Shouldn't the operation be the same, i.e. get the first 10 rows by PK and then join with the other tables? Since there's an FK constraint, the relationship is guaranteed to be satisfied, so the DB doesn't need to join more rows than necessary and then trim the result, right? Unless the FK constraint allows a NULL FK? In which case I guess a left join would make this much faster than an inner join?

Lastly, I'm guessing query 3 is simply faster because fewer columns are used in those unnecessary joins? But why would query execution need the other columns while joining? Shouldn't it just join using PKs first, and then get the columns for just the 10 rows?

Thanks!

My understanding is that the MySQL engine applies LIMIT after any joins happen.

From http://dev.mysql.com/doc/refman/5.0/en/select.html: "The HAVING clause is applied nearly last, just before items are sent to the client, with no optimization. (LIMIT is applied after HAVING.)"

EDIT: You could try using this query to take advantage of the PK speed.

select * from (select * from `Order` order by OrderID desc limit 10) o join `Store` s on s.StoreID = o.StoreID;
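
A variation on the same idea, offered only as a sketch: it assumes OrderID is the primary key of `Order` and that StoreID is NOT NULL with an enforced foreign key, so the inner join cannot drop any of the ten rows. The derived table reads just the ten newest key values, and the wide rows are fetched afterwards. The alias `newest` is illustrative and does not come from the question.

```sql
-- Step 1 (derived table): pick the 10 newest keys, touching only the primary key.
-- Step 2: join back to `Order` for the full rows, then look up each store.
SELECT o.*, s.*
FROM (
    SELECT OrderID
    FROM `Order`
    ORDER BY OrderID DESC
    LIMIT 10
) AS newest                       -- hypothetical alias
JOIN `Order` AS o ON o.OrderID = newest.OrderID
JOIN `Store` AS s ON s.StoreID = o.StoreID
-- The outer ORDER BY is still needed: rows coming out of a
-- derived table are not guaranteed to keep their order.
ORDER BY o.OrderID DESC;
```

If StoreID can be NULL, changing the last join to a LEFT JOIN keeps orders without a store in the result.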

All of your examples ask for table scans of the existing tables, so none of them will perform better or worse than MySQL's ability to cache the data or results allows. Some of your queries have ORDER BY or join criteria, which can use indexes to make the joining process more efficient; however, that is still not the same as having a set of criteria that triggers the use of indexes.

LIMIT is not a criterion -- it can be thought of as a filter applied once the result set is determined. You save time on the client once the result set is prepared, but not on the server.

Really, the only way to get the answers you are seeking is to become familiar with: EXPLAIN EXTENDED your_sql_statement

The output of EXPLAIN will show you how many rows MySQL expects to examine, as well as whether any indexes are being used.
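
For instance, comparing the plan of the slow join with that of a query that limits before joining might look like the sketch below. It assumes the table and column names from the question. Which plan the optimizer picks depends on the server version and table statistics, so no output is shown here.

```sql
-- Plan for the slow form (query 2 in the question).
EXPLAIN
SELECT *
FROM `Order` AS o
JOIN `Store` AS s ON s.StoreID = o.StoreID
ORDER BY o.OrderID DESC
LIMIT 10;

-- Plan for the form that limits before joining.
EXPLAIN
SELECT *
FROM (SELECT * FROM `Order` ORDER BY OrderID DESC LIMIT 10) AS o
JOIN `Store` AS s ON s.StoreID = o.StoreID;
```

In the output, the `key` column names the index chosen for each table and `rows` is the estimated number of rows examined. An `Extra` value containing "Using temporary; Using filesort" suggests the joined result is being sorted before LIMIT is applied. As far as I know, the EXTENDED keyword was removed in MySQL 8.0, where plain EXPLAIN followed by SHOW WARNINGS gives the same additional information.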
