
How do I speed up this SQL query?

I have the following query:

select min(a) from tbl where b > ?;

It takes about 4 seconds on my MySQL instance with index(b, a) on a table of 15M rows. Is there a way to speed it up?

Explain:

explain select min(parsed_id) from replays where game_date > '2016-10-01';

id:            1
select_type:   SIMPLE
table:         replays
partitions:    NULL
type:          range
possible_keys: replays_game_date_index,replays_game_date_parsed_id_index
key:           replays_game_date_parsed_id_index
key_len:       6
ref:           NULL
rows:          6854021
filtered:      100.00
Extra:         Using where; Using index

Index statement:

create index replays_game_date_parsed_id_index on replays (game_date, parsed_id);

I think the index MySQL is using is the right one. The query should be instantaneous, since a SINGLE read from the index should return the result you want. I guess MySQL's SQL optimizer is doing a very poor job on this query.

Maybe you could rephrase your query to trick the SQL optimizer into using a different strategy. Maybe you can try:

select parsed_id 
from replays
where game_date > '2016-10-01'
order by parsed_id
limit 1

Is this version any faster?

select @mina
from (select (@mina := least(@mina, a)) as mina
     from tbl cross join
          (select @mina := 999999) params
     where b > ?
    ) t
limit 1;

I suspect this won't make much difference, but I'm not sure what happens under the hood when an aggregation function runs over such a large index range.

This may or may not help: change the query and add an index:

SELECT a FROM tbl WHERE b > ? ORDER BY a LIMIT 1;

INDEX(a, b)

Then, if a row with a matching b occurs early enough in the index scan (which proceeds in order of a), this will be faster than the other suggestions.

On the other hand, if the only matching b is near the end of the index, this will have to scan nearly all of the index and be slower than the other options.

a needs to be first in the index. By having both columns in the index, it becomes a "covering" index, hence a bit faster.

It may be that using my SELECT, together with both indexes, will give the Optimizer enough to pick the better approach:

INDEX(a,b)
INDEX(b,a)
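
As a rough sketch of how this could be tried on the table from the question: the statements below assume the replays table with the parsed_id and game_date columns shown in the EXPLAIN output, where the (game_date, parsed_id) index already exists. The new index name is hypothetical, and whether the forced plan is actually faster depends on how early a matching game_date appears in parsed_id order.

```sql
-- Hypothetical index name; the (game_date, parsed_id) index already exists.
CREATE INDEX replays_parsed_id_game_date_index
    ON replays (parsed_id, game_date);

-- See which of the two indexes the optimizer picks for the rewritten query.
EXPLAIN
SELECT parsed_id
FROM replays
WHERE game_date > '2016-10-01'
ORDER BY parsed_id
LIMIT 1;

-- For a timing comparison, force the scan in parsed_id order.
SELECT parsed_id
FROM replays FORCE INDEX (replays_parsed_id_game_date_index)
WHERE game_date > '2016-10-01'
ORDER BY parsed_id
LIMIT 1;
```

If the forced version is consistently slower for realistic dates, the matching rows probably sit late in parsed_id order and the original index is the better choice.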

Schema

Adding either (or both) composite indexes should help.

Shrinking the table size is likely to help...

  • INT takes 4 bytes. Consider whether a smaller datatype would suffice for any of those columns.
  • There are 3 date columns (DATETIME, TIMESTAMP); do you need all of them?
  • Is fingerprint varchar(36) a UUID/GUID? If so, it could be packed into BINARY(16).
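
A minimal sketch of the BINARY(16) packing, assuming fingerprint lives in the replays table and always holds canonical 36-character UUID text (32 hex digits plus 4 hyphens). The fingerprint_bin column name is hypothetical. On MySQL 8.0 and later, the built-in UUID_TO_BIN() and BIN_TO_UUID() functions could be used instead of the UNHEX/HEX pair.

```sql
-- Assumption: every fingerprint is canonical UUID text. Verify this first;
-- the count should be 0 before converting.
SELECT COUNT(*)
FROM replays
WHERE CHAR_LENGTH(fingerprint) <> 36;

-- Hypothetical new column holding the packed value.
ALTER TABLE replays ADD COLUMN fingerprint_bin BINARY(16);

-- Strip the hyphens and turn the 32 hex digits into 16 raw bytes.
UPDATE replays
SET fingerprint_bin = UNHEX(REPLACE(fingerprint, '-', ''));

-- Read the packed value back as hex text when needed.
SELECT LOWER(HEX(fingerprint_bin)) AS fingerprint_hex
FROM replays
LIMIT 5;
```

Once the application reads the new column, the original varchar(36) column and any index on it can be dropped, which is where the space saving comes from.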

640MB is tight; check the graphs to make sure there is no "swapping". (Swapping would be really bad for performance.)
