Optimizing a MySQL query with the proper index

Question

I have a table of 15.1 million records. I'm running the following query on it to process the records for duplicate checking.

select id, name, state, external_id 
from companies
where dup_checked=0 
order by name 
limit 500;

When I run EXPLAIN EXTENDED on the query, it tells me it's using the index_companies_on_name index, which is just an index on the company name. I'm assuming this is due to the ordering. I tried creating other indexes on the name and dup_checked fields, hoping MySQL would use one of them because it might be faster, but it still uses the index_companies_on_name index.

Initially it was fast enough, but now we're down to 3.3 million records left to check and this query is taking up to 90 seconds to execute. I'm not quite sure what else to do to make this run faster. Is a different index the answer or something else I'm not thinking of? Thanks.

Answer 1

Generally the trick here is to create a composite index whose leading column is the one you filter on, so the number of rows to consider is reduced first, and whose second column is the one you order by:

CREATE INDEX `index_companies_on_dup_checked_name`
  ON `companies` (`dup_checked`,`name`);

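With dup_checked as the leading column, MySQL can jump straight to the dup_checked=0 entries, which are already stored in name order, and stop after reading 500 of them. With the name-only index it has to walk the index in name order and skip every row that has already been checked, which gets slower as more of the table is processed. Column order matters here: an index on (name, dup_checked) would not let it skip the checked rows in the same way.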
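To check whether the optimizer actually picks up the new index, the original query can be run through EXPLAIN again. The following is a sketch that assumes the index has been created under the name used above; the second statement uses a FORCE INDEX hint only as a way to compare plans if MySQL keeps choosing index_companies_on_name.

```sql
-- Assumes index_companies_on_dup_checked_name (dup_checked, name) already exists.
-- Inspect the plan for the original query.
EXPLAIN
SELECT id, name, state, external_id
FROM companies
WHERE dup_checked = 0
ORDER BY name
LIMIT 500;

-- For comparison only: hint the composite index if the optimizer
-- still prefers index_companies_on_name.
EXPLAIN
SELECT id, name, state, external_id
FROM companies FORCE INDEX (index_companies_on_dup_checked_name)
WHERE dup_checked = 0
ORDER BY name
LIMIT 500;
```

In the output, the key column should name the composite index, and the Extra column should not show "Using filesort", since the rows come out of the index already ordered by name.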

solution1
1 ACCEPTED 2015-10-05 18:16:54
