
Optimizing a MySQL query with the proper index

I have a table of 15.1 million records. I'm running the following query on it to process the records for duplicate checking.

select id, name, state, external_id 
from companies
where dup_checked=0 
order by name 
limit 500;

When I run EXPLAIN EXTENDED on the query, it tells me it's using the index_companies_on_name index, which is just an index on the company name. I'm assuming this is due to the ordering. I tried creating other indexes on the name and dup_checked fields, hoping it would use one of them since that might be faster, but it still uses the index_companies_on_name index.

Initially it was fast enough, but now we're down to 3.3 million records left to check and this query is taking up to 90 seconds to execute. I'm not quite sure what else to do to make this run faster. Is a different index the answer, or is there something else I'm not thinking of? Thanks.

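Generally the trick here is to create an index that filters first, reducing the number of rows that have to be examined, and applies the ordering second. With dup_checked as the leading column, the entries for dup_checked=0 sit together in the index already sorted by name, so MySQL can read the first 500 of them in order instead of walking the whole name index and discarding rows that have already been checked: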

CREATE INDEX `index_companies_on_dup_checked_name`
  ON `companies` (`dup_checked`,`name`);

That should give you the scope you need.
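
To confirm the optimizer actually picks the new index, something along these lines may help. This is only a sketch and assumes the composite index above has already been created. The ANALYZE TABLE step and the FORCE INDEX hint are additions for diagnosis, not part of the original suggestion.

```sql
-- Refresh index statistics so the optimizer works from current estimates.
ANALYZE TABLE companies;

-- Check which index is chosen: the `key` column of the output should show
-- index_companies_on_dup_checked_name, and `Extra` should not mention
-- "Using filesort".
EXPLAIN
SELECT id, name, state, external_id
FROM companies
WHERE dup_checked = 0
ORDER BY name
LIMIT 500;

-- If index_companies_on_name is still chosen, an index hint lets you
-- compare the plan (and the timing) with the composite index forced.
EXPLAIN
SELECT id, name, state, external_id
FROM companies FORCE INDEX (index_companies_on_dup_checked_name)
WHERE dup_checked = 0
ORDER BY name
LIMIT 500;
```

If the hinted version turns out to be clearly faster, the hint can stay in the query as a stopgap, though it is usually better to find out why the optimizer prefers the other index.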
