How do I speed up this SQL query?
I have the following query:
select min(a) from tbl where b > ?;
and it takes about 4 seconds on my MySQL instance with index(b, a) (15M rows). Is there a way to speed it up?
Explain:
explain select min(parsed_id) from replays where game_date > '2016-10-01';
id: 1
select_type: SIMPLE
table: replays
partitions: NULL
type: range
possible_keys: replays_game_date_index,replays_game_date_parsed_id_index
key: replays_game_date_parsed_id_index
key_len: 6
ref: NULL
rows: 6854021
filtered: 100.00
Extra: Using where; Using index
Index statement:
create index replays_game_date_parsed_id_index on replays (game_date, parsed_id);
I think the index MySQL is using is the right one. The query should be instantaneous, since a SINGLE read from the index should return the result you want. I guess that for this query MySQL's SQL optimizer is doing a very poor job.
Maybe you could rephrase your query to trick the SQL optimizer into using a different strategy. Maybe you can try:
select parsed_id
from replays
where game_date > '2016-10-01'
order by parsed_id
limit 1
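One thing worth checking before swapping min() for ORDER BY ... LIMIT 1 is that the two forms return different shapes when no row matches. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, purely to compare result sets (it says nothing about MySQL's performance or plan choice); the four sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption:
# only the result-set semantics are being compared, not the speed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE replays (parsed_id INTEGER, game_date TEXT)")
conn.executemany(
    "INSERT INTO replays (parsed_id, game_date) VALUES (?, ?)",
    [(30, "2016-09-15"), (20, "2016-10-05"), (10, "2016-11-01"), (40, "2016-10-20")],
)

MIN_QUERY = "SELECT MIN(parsed_id) FROM replays WHERE game_date > ?"
LIMIT_QUERY = (
    "SELECT parsed_id FROM replays WHERE game_date > ? "
    "ORDER BY parsed_id LIMIT 1"
)


def run(query, cutoff):
    """Run one of the two queries and return all rows as a list of tuples."""
    return conn.execute(query, (cutoff,)).fetchall()


# Some rows match: both forms return the same single value.
print(run(MIN_QUERY, "2016-10-01"), run(LIMIT_QUERY, "2016-10-01"))

# No rows match: MIN() yields one row holding NULL, LIMIT yields no rows.
print(run(MIN_QUERY, "2017-01-01"), run(LIMIT_QUERY, "2017-01-01"))
```

If the calling code expects exactly one row back, the rewritten query needs an explicit check for the empty case.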
Is this version any faster?
select @mina
from (select (@mina := least(@mina, a)) as mina
from tbl cross join
(select @mina := 999999) params
where b > ?
) t
limit 1;
I suspect this won't make much difference, but I'm not sure what happens under the hood with such a large aggregation function running over an index.
This may or may not help: change the query and add an index:
SELECT a FROM tbl WHERE b > ? ORDER BY a LIMIT 1;
INDEX(a, b)
Then, if a row with a matching b occurs early enough in the index (which is ordered by a), this will be faster than the other suggestions.
On the other hand, if the only matching b is near the end of the index, this will have to scan nearly all the index and be slower than the other options.
a needs to be first in the index. By having both columns in the index, it becomes a "covering" index, hence a bit faster.
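The trade-off between the two column orders can be illustrated with a toy model. This is a minimal sketch in Python, not a description of MySQL internals: it assumes an index behaves like a sorted list of tuples, and the function names and the sample data (1000 rows with a equal to b) are made up for illustration.

```python
from bisect import bisect_right


def scan_index_b_a(rows, threshold):
    """Model INDEX(b, a): seek to the first entry with b > threshold, then
    read every remaining entry, since a is not sorted across that range."""
    index = sorted((b, a) for a, b in rows)
    start = bisect_right(index, (threshold, float("inf")))
    examined = len(index) - start
    best = min((a for _, a in index[start:]), default=None)
    return best, examined


def scan_index_a_b(rows, threshold):
    """Model INDEX(a, b): walk entries in ascending a and stop at the
    first entry whose b exceeds the threshold."""
    index = sorted(rows)  # rows are (a, b) tuples
    for examined, (a, b) in enumerate(index, start=1):
        if b > threshold:
            return a, examined
    return None, len(index)


rows = [(i, i) for i in range(1000)]  # hypothetical data: a == b

# A match occurs early in a order: INDEX(a, b) stops after 11 entries,
# while INDEX(b, a) reads the whole 990-entry range.
print(scan_index_a_b(rows, 9), scan_index_b_a(rows, 9))

# The only match is at the very end: INDEX(a, b) reads all 1000 entries,
# while INDEX(b, a) reads just 1.
print(scan_index_a_b(rows, 998), scan_index_b_a(rows, 998))
```

Each function returns the minimum a together with the number of index entries it had to examine, which is the quantity that differs between the two layouts.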
It may be that using my SELECT together with both indexes will give the Optimizer enough to pick the better approach:
INDEX(a,b)
INDEX(b,a)
Schema
Adding either (or both) composite indexes should help.
Shrinking the table size is likely to help...
INT takes 4 bytes. Consider whether a smaller datatype would suffice for any of those columns.
Date/time columns (DATETIME, TIMESTAMP): do you need all of them?
Is fingerprint varchar(36) a UUID/GUID? If so, it could be packed into BINARY(16).
640MB is tight; check the graphs to make sure there is no "swapping". (Swapping would be really bad for performance.)
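To give a sense of the saving from the BINARY(16) suggestion, here is a small Python sketch of packing a UUID string into 16 bytes on the application side and turning it back into text. It assumes fingerprint really holds standard hyphenated UUIDs; the helper names and the sample value are made up for illustration.

```python
import uuid


def pack_uuid(text: str) -> bytes:
    """Convert a 36-character UUID string into its 16-byte binary form."""
    return uuid.UUID(text).bytes


def unpack_uuid(raw: bytes) -> str:
    """Convert 16 raw bytes back into the lowercase hyphenated UUID string."""
    return str(uuid.UUID(bytes=raw))


sample = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical fingerprint
packed = pack_uuid(sample)

print(len(sample), len(packed))       # 36 characters versus 16 bytes
print(unpack_uuid(packed) == sample)  # the round trip is lossless
```

Note that uuid.UUID raises ValueError for strings that are not valid UUIDs, so this would also reveal whether the column contains anything else.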