MySQL query faster in DESC order than ASC order
I made a simple database (InnoDB, version 5.7.9) with two tables, post and post_tag.
post has a single column, id (BIGINT), set as the primary key (about 120,000 rows). post_tag has two columns, post_id (BIGINT) and tag_id (INT), and its primary key is on (post_id, tag_id).
The following query runs in ~1 ms:
SELECT
SQL_NO_CACHE p.id
FROM
post as p
STRAIGHT_JOIN
post_tag t
WHERE
t.post_id = p.id AND t.tag_id = 25
ORDER BY
p.id DESC
LIMIT 0, 100
But if I change the ORDER BY to ASC, it runs about 100 times slower! And that is the sort order I am interested in.
Any idea why?
Initially, I wanted the IDs sorted DESC, and I noticed it was slower than ASC. I read that the natural sort order for an index is ASC, so I inverted all the IDs (by setting ID = SOMETHING BIG - ID), but that didn't change anything: the query is now the slow one in ASC.
I uploaded the database here in case it's useful.
Many thanks in advance to anyone who can help.
If there are "other constraints", then all bets are off.
Meanwhile, looking at what you have...
STRAIGHT_JOIN, USE INDEX, etc., are crutches for when (a) you don't have the 'right' index, or (b) the optimizer can't figure out the 'right' thing to do. That is, look for other solutions.
In your example, you would be better off with a plain JOIN and INDEX(tag_id, post_id). This would let it go to post_tag first, since there is a WHERE clause letting it filter there. The optimizer will probably see that t.post_id and p.id are identical, so it can start at the end (for DESC) of (25, post_id) in the index, and scan. It then checks to see if there is a post entry (that being the only apparent use for post; again, if there are "other constraints", all bets are off).
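As a rough sketch of that suggestion, using the table and column names from the question. The index name tag_post is made up for illustration, and the plan you actually get depends on your data and server version:

```sql
-- Secondary index that leads with tag_id, so the WHERE filter can use it.
-- The index name tag_post is a hypothetical choice.
ALTER TABLE post_tag ADD INDEX tag_post (tag_id, post_id);

-- Plain JOIN instead of STRAIGHT_JOIN: the optimizer is free to pick
-- which table to read first.
EXPLAIN
SELECT p.id
FROM post AS p
JOIN post_tag AS t ON t.post_id = p.id
WHERE t.tag_id = 25
ORDER BY p.id DESC
LIMIT 0, 100;
```

If the EXPLAIN output lists t first with tag_post as its key, the optimizer has chosen the order described above. The same check can be repeated with ASC.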
So, back to the original question. STRAIGHT_JOIN forced looking in post first. But where are the 25s? Apparently near the end of post_tag. Hence, ASC took longer to find 100 (see LIMIT) of them than if the scan started at the other end!
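One way to test that guess about where the 25s sit is to compare the range of post_id values carrying tag 25 with the full range of post ids. This is only a sketch against the schema from the question:

```sql
-- Range and count of posts that carry tag 25.
SELECT MIN(post_id) AS first_tagged,
       MAX(post_id) AS last_tagged,
       COUNT(*)     AS tagged_rows
FROM post_tag
WHERE tag_id = 25;

-- Full range of post ids, for comparison.
SELECT MIN(id) AS first_post,
       MAX(id) AS last_post
FROM post;
```

If first_tagged is much closer to last_post than to first_post, an ascending walk through post has to step over many untagged posts before it collects 100 matches, which would be consistent with the slowdown.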
Assuming this is a many-to-many mapping table, do this:
CREATE TABLE post_tag (
post_id ...,
tag_id ...,
PRIMARY KEY(post_id, tag_id),
INDEX (tag_id, post_id)
) ENGINE=InnoDB;
I discuss the many reasons in my blog.
If, as was suggested, you add (tag_id, post_id DESC), don't be deluded into thinking that the DESC means anything: it is recognized, but ignored. Both parts will be stored ASC. What will happen is that the Optimizer is smart enough to start at the end of the 25s and scan backward. Here's "proof":
US has INDEX(state, population):
mysql> FLUSH STATUS;
mysql> SELECT city, population FROM US
WHERE state = 'OH'
ORDER BY population DESC LIMIT 5;
+------------+------------+
| city | population |
+------------+------------+
| Columbus | 736836 |
| Cleveland | 449514 |
| Toledo | 306974 |
| Cincinnati | 306382 |
| Akron | 208414 |
+------------+------------+
mysql> SHOW SESSION STATUS LIKE 'Handler%';
| Handler_read_key | 1 | -- get started at end of Ohio
| Handler_read_prev | 4 | -- read (5-1) more, scanning backwards
The only case where MySQL is missing the boat by ignoring DESC in an INDEX declaration is: ORDER BY a ASC, b DESC cannot use INDEX(a,b).
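A small way to see this for yourself, assuming a throwaway table (sort_demo and its index name ab are hypothetical, not part of the question's schema):

```sql
-- Hypothetical demo table; the names are for illustration only.
CREATE TABLE sort_demo (
  a INT NOT NULL,
  b INT NOT NULL,
  INDEX ab (a, b)
) ENGINE=InnoDB;

INSERT INTO sort_demo (a, b) VALUES (1, 1), (1, 2), (2, 1), (2, 2);

-- Both columns in the same direction: the index order, or its exact
-- reverse, matches the requested order.
EXPLAIN SELECT a, b FROM sort_demo ORDER BY a ASC, b ASC;
EXPLAIN SELECT a, b FROM sort_demo ORDER BY a DESC, b DESC;

-- Mixed directions: neither a forward nor a backward scan of (a, b)
-- yields this order, so check the Extra column for "Using filesort".
EXPLAIN SELECT a, b FROM sort_demo ORDER BY a ASC, b DESC;
```

Comparing the Extra column across the three plans on your own server is the real test; the comments only describe what the statement above leads one to expect.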
Presumably, you have an index on post(id) (this is created automatically for primary keys, for instance). MySQL sometimes pays attention to the order of the index when using an index for ORDER BY.
By changing the order, you are changing the query plan in such a way that sorting is necessary.
I would suggest writing the query using only one table:
SELECT t.post_id
FROM post_tag t
WHERE t.tag_id = 25
ORDER BY t.post_id DESC
LIMIT 0, 100;
The JOIN is not necessary for this query, assuming that all values of post_id refer to valid posts (which seems like a very reasonable assumption).
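That assumption can be checked directly. A minimal sketch, using the table names from the question:

```sql
-- Count post_tag rows whose post_id has no matching row in post.
SELECT COUNT(*) AS orphan_rows
FROM post_tag AS t
LEFT JOIN post AS p ON p.id = t.post_id
WHERE p.id IS NULL;
```

If orphan_rows is 0, dropping the join does not change the result. A foreign key from post_tag.post_id to post.id would be one way to keep it that way.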
For this query, an index on post_tag(tag_id, post_id desc) is optimal, and MySQL might do the right thing for a descending sort.
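To see whether MySQL really does the right thing, the Handler counters used in the Ohio example can be read for this query as well. This sketch assumes an index on (tag_id, post_id) already exists on post_tag:

```sql
-- Reset the session counters, run the query, then read them back.
FLUSH STATUS;

SELECT t.post_id
FROM post_tag t
WHERE t.tag_id = 25
ORDER BY t.post_id DESC
LIMIT 0, 100;

SHOW SESSION STATUS LIKE 'Handler_read%';
```

If the index serves both the filter and the ordering, the counters should resemble the Ohio example: a single key read to land at the end of the 25s, then roughly one Handler_read_prev per additional row returned.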