ORDER BY ... ASC is slow and “Using index condition”

I have 2 tables: user and post.

Here are their SHOW CREATE TABLE statements:

CREATE TABLE `user` (
  `user_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `user_name` varchar(20) CHARACTER SET latin1 NOT NULL,
  `create_date` datetime DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=59 DEFAULT CHARSET=utf8;

CREATE TABLE `post` (
  `post_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `owner_id` bigint(20) NOT NULL,
  `data` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
  PRIMARY KEY (`post_id`),
  KEY `my_fk` (`owner_id`),
  CONSTRAINT `my_fk` FOREIGN KEY (`owner_id`) REFERENCES `user` (`user_id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1012919 DEFAULT CHARSET=utf8;

Everything is fine until I execute 2 queries with an ORDER BY clause. The result is very strange: ASC is slow but DESC is very fast.

SELECT sql_no_cache * FROM mydb.post where post_id > 900000 and owner_id = 20 order by post_id desc limit 10;
10 rows in set (0.00 sec)

SELECT sql_no_cache * FROM mydb.post where post_id > 900000 and owner_id = 20 order by post_id asc limit 10;
10 rows in set (0.15 sec)

Then I use EXPLAIN statements:

explain SELECT sql_no_cache * FROM mydb.post where post_id > 900000 and owner_id = 20 order by post_id desc limit 10;
+----+-------------+-------+------+---------------+-------+---------+-------+--------+-------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref   | rows   | Extra       |
+----+-------------+-------+------+---------------+-------+---------+-------+--------+-------------+
|  1 | SIMPLE      | post  | ref  | PRIMARY,my_fk | my_fk | 8       | const | 239434 | Using where |
+----+-------------+-------+------+---------------+-------+---------+-------+--------+-------------+
1 row in set (0.01 sec)


explain SELECT sql_no_cache * FROM mydb.post where post_id > 900000 and owner_id = 20 order by post_id asc limit 10;
+----+-------------+-------+------+---------------+-------+---------+-------+--------+------------------------------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref   | rows   | Extra                              |
+----+-------------+-------+------+---------------+-------+---------+-------+--------+------------------------------------+
|  1 | SIMPLE      | post  | ref  | PRIMARY,my_fk | my_fk | 8       | const | 239434 | Using index condition; Using where |
+----+-------------+-------+------+---------------+-------+---------+-------+--------+------------------------------------+
1 row in set (0.00 sec)

I think the point is Using index condition, but I don't know why. How can I improve my database for better performance?

UPDATE:

explain SELECT * FROM mydb.post where post_id < 600000 and owner_id = 20 order by post_id desc limit 10;
+----+-------------+-------+------+---------------+-------+---------+-------+--------+-------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref   | rows   | Extra       |
+----+-------------+-------+------+---------------+-------+---------+-------+--------+-------------+
|  1 | SIMPLE      | post  | ref  | PRIMARY,my_fk | my_fk | 8       | const | 505440 | Using where |
+----+-------------+-------+------+---------------+-------+---------+-------+--------+-------------+


explain SELECT * FROM mydb.post where post_id < 600000 and owner_id > 19 and owner_id < 21 order by post_id desc limit 10;
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows   | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+
|  1 | SIMPLE      | post  | range | PRIMARY,my_fk | PRIMARY | 4       | NULL | 505440 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+

These are the relevant facts to understand this behavior:

  • You are using InnoDB, which uses a clustered index. The single interesting side effect of clustered indexes for your particular case is that every non-primary-key index implicitly contains the primary key as its very last column. There is no need for an index on (owner_id, post_id): you already have it.

  • MySQL can't resolve range conditions (<, >) on non-leading index columns in the right way. Instead, it ignores them during the index lookup and later applies this part of the where clause as a filter. This is just a MySQL limitation: it does not start the scan directly at the position of post_id = 900000. Other databases do this just fine.

  • When you use DESC order, MySQL starts reading the index at the biggest post_id value it finds. It then applies your filter post_id > 900000. If the row matches, it returns the row. Then it proceeds to the next row, and so on, until it has found 10 matching rows. All the matching rows are guaranteed to be where the index scan started, so the scan stops almost immediately.

  • When you use ASC order, MySQL starts reading the index at the other end, checks each value against post_id > 900000, and will probably need to discard the row because post_id is below that threshold. Now guess how many rows it needs to process this way before it finds the first row that matches post_id > 900000? That's what's eating up your time.

  • "Using index condition" refers to Index Condition Pushdown: http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html I'd say it should apply in both cases. However, it is not so relevant in the DESC case because the filter doesn't remove any rows anyway. In the ASC case it is very relevant, and performance would be worse without it.
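The scan behavior described in the points above can be sketched with a toy model. This is only an illustration of the explanation, not MySQL's actual executor: the secondary index is modeled as a sorted Python list of (owner_id, post_id) tuples, and the table size, the owner distribution, and the helper name rows_examined are all made up for the example.

```python
from bisect import bisect_left, bisect_right


def rows_examined(index, owner_id, threshold, descending, limit=10):
    """Walk one owner's slice of an (owner_id, post_id) index.

    The lookup uses only owner_id; post_id > threshold is applied as a
    filter to each entry afterwards. Returns (matching post_ids, number
    of index entries examined).
    """
    # Locate the contiguous slice of entries for this owner.
    lo = bisect_left(index, (owner_id, float("-inf")))
    hi = bisect_right(index, (owner_id, float("inf")))
    entries = index[lo:hi]
    if descending:
        entries = reversed(entries)

    found, examined = [], 0
    for _, post_id in entries:
        examined += 1
        if post_id > threshold:
            found.append(post_id)
            if len(found) == limit:
                break
    return found, examined


# Hypothetical data: 100,000 posts, owner 20 owns every 4th one.
index = sorted(
    (20 if post_id % 4 == 0 else 21, post_id) for post_id in range(1, 100_001)
)

_, desc_cost = rows_examined(index, 20, 90_000, descending=True)
_, asc_cost = rows_examined(index, 20, 90_000, descending=False)
print(desc_cost)  # 10
print(asc_cost)   # 22510
```

In this model the DESC scan stops after 10 entries, while the ASC scan has to discard every entry of that owner at or below the threshold first. Lowering the threshold shrinks the ASC cost and leaves the DESC cost at 10.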

If you want to verify my statements, you could

  • Increase/decrease the numeric value (900000) and see how the performance changes. Lower values should make ASC faster while keeping DESC fast too.

  • Change the range condition > to < and see if it reverses the performance behavior of ASC / DESC. Remember that you might need to change the number to some lower value to actually see the performance difference.

How could one possibly know that?

http://use-the-index-luke.com/ is my guide that explains how indexes work.

It is not because of “Using index condition” but because of how MySQL uses indexes and how its query engine works. MySQL uses a simple query analyzer and optimizer.

In the case of post_id > 900000 and owner_id = 20, you may notice it tries to use the key my_fk, which is a "bigger index" since its size is (64+32) bits * rows. It finds all entries with owner_id = 20 from the index (yep, post_id was not used. Stupid MySQL).

After MySQL has used this big, heavy index to locate all the rows you need, it does another lookup to read the actual rows by their primary keys (because you do SELECT *), which costs a few more HDD seeks, and then filters the result using post_id > 900000 (slow).

In the case of order by post_id desc, there could be many reasons why it runs faster. One possible reason is the InnoDB cache: the most recently inserted rows are warmer and easier to access than the others.

In the case of post_id > 900000 and owner_id > 19 and owner_id < 21, MySQL gives up on my_fk, because a range scan on the secondary index is no better than a range scan on the primary index.

It just uses the PK to locate the right page for post_id 900000 and does a sequential read from there, provided your InnoDB pages are not fragmented (assuming you are using AUTO_INCREMENT). It scans some pages and filters out what matches your need.

To do an "optimization" (do it now): don't use SELECT *.

To do a "premature optimization" (don't do it; don't do it yet): hint MySQL with USE INDEX, or create an index that contains exactly all the columns you need.

It is hard to say which is faster, my_fk or PK, because the performance varies with the pattern of the data. If owner_id = 20 is dominant or common in your table, using the PK directly could be faster.

If owner_id = 20 is not common in your table, my_fk will give a boost, as there would otherwise be too many rows to read before reaching (post_id > 900000 + XXX).
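The trade-off between the two access paths can be sketched by counting rows read in a toy table. This is a rough model under made-up assumptions: the table is a Python list of owner ids indexed by post_id, the helper names are hypothetical, and the extra primary-key lookup that each secondary-index hit costs for SELECT * is not counted.

```python
def make_table(n, every):
    """owners[post_id] is the owner of that post; owner 20 owns every `every`-th post."""
    return [None] + [20 if post_id % every == 0 else 99 for post_id in range(1, n + 1)]


def pk_range_scan(owners, target_owner, threshold, limit=10):
    """Rows read when scanning the primary key upward from post_id > threshold."""
    read = matched = 0
    for post_id in range(threshold + 1, len(owners)):
        read += 1
        if owners[post_id] == target_owner:
            matched += 1
            if matched == limit:
                break
    return read


def secondary_index_scan(owners, target_owner, threshold, limit=10):
    """Index entries read when walking the owner's entries in ascending
    post_id order and applying post_id > threshold as a filter afterwards."""
    read = matched = 0
    for post_id in range(1, len(owners)):
        if owners[post_id] != target_owner:
            continue  # not part of this owner's index slice, never read
        read += 1
        if post_id > threshold:
            matched += 1
            if matched == limit:
                break
    return read


common = make_table(100_000, 2)     # owner 20 owns half of all posts
rare = make_table(100_000, 1000)    # owner 20 owns 100 posts

print(pk_range_scan(common, 20, 90_000))         # 20
print(secondary_index_scan(common, 20, 90_000))  # 45010
print(pk_range_scan(rare, 20, 90_000))           # 10000
print(secondary_index_scan(rare, 20, 90_000))    # 100
```

With a common owner the primary-key scan finds its 10 rows almost at once. With a rare owner it has to read thousands of rows, and the owner's own index slice is the shorter path.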

-- EDIT: BTW, try ORDER BY owner_id ASC, post_id ASC or DESC. MySQL will be faster if it can just use the index's order (rather than sorting the rows itself).

I'm not a MySQL expert, but I don't think that either query is using an index, unless there are indexes you have created which you haven't told us about. I think that 'Using index condition' is possibly an artefact of the way MySQL implements the LIMIT keyword.

If you put an index made up of (owner_id, post_id) on your post table, it will help these two queries. In MySQL it should look something like:

create index ix_post_userpost on post (owner_id, post_id)

(I don't guarantee that syntax as I don't have MySQL.)
