Optimizing MySQL query with a composite index

I have a table which currently has about 80 million rows, created as follows:

create table records
(
  id      int auto_increment primary key,
  created int             not null,
  status  int default '0' not null
)
  collate = utf8_unicode_ci;

create index created_and_status_idx
  on records (created, status);

The created column contains Unix timestamps, and status is an integer between -10 and 10. The records are evenly distributed over the created range, and around half of them have status 0 or -10.

I have a cron job that selects records between 8 and 32 days old with certain statuses, processes them, and then deletes them. The query is as follows:

SELECT
    records.id
FROM records
WHERE
    (records.status = 0 OR records.status = -10)
    AND records.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
LIMIT 500

The query was fast while the matching records were at the beginning of the created interval, but now that the cleanup has reached the records at the end of the interval, it takes about 10 seconds to run. EXPLAIN shows that the query uses the index, but it examines about 40 million rows.

Is there anything I can do to improve the performance of this query, and if so, how exactly?

Thank you.

I think UNION ALL is your best approach:

(SELECT r.id
 FROM records r
 WHERE r.status = 0 AND
       r.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
 LIMIT 500
) UNION ALL
(SELECT r.id
 FROM records r
 WHERE r.status = -10 AND
       r.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
 LIMIT 500
) 
LIMIT 500;

This can use an index on records(status, created, id). Note: use UNION instead if the two subqueries could return the same id.

You are also using LIMIT with no ORDER BY. That is generally discouraged, because without an ORDER BY there is no guarantee which 500 rows come back.
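As a rough sketch, the UNION ALL query above could be given a deterministic order like this. It assumes the cleanup wants the oldest rows first, which the question does not state. Each branch also selects created so that the outer ORDER BY can refer to it.

```sql
-- Sketch only: oldest-first ordering is an assumption, not a stated requirement.
(SELECT r.id, r.created
 FROM records r
 WHERE r.status = 0
   AND r.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
 ORDER BY r.created, r.id   -- matches the order of an index on (status, created, id)
 LIMIT 500)
UNION ALL
(SELECT r.id, r.created
 FROM records r
 WHERE r.status = -10
   AND r.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
 ORDER BY r.created, r.id
 LIMIT 500)
-- The outer ORDER BY uses the column names from the first SELECT.
ORDER BY created, id
LIMIT 500;
```

With status fixed to a single value in each branch, the rows come out of the index already sorted by created, so the inner ORDER BY should not need an extra sort. The outer sort touches at most 1000 rows.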

Your index is in the wrong order. You should put the IN column (status) first (you phrased it as an OR), and put the 'range' column (created) last:

INDEX(status, created)
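A minimal sketch of how that index might be added and then checked. The index name status_and_created_idx is made up for illustration. Building an index on a table of about 80 million rows can take a while, so it is worth trying on a copy first.

```sql
-- Hypothetical index name; the column order (status, created) is the point.
ALTER TABLE records
  ADD INDEX status_and_created_idx (status, created);

-- Re-check the plan: the "key" column should show the new index,
-- and the "rows" estimate should drop well below the earlier figure.
EXPLAIN
SELECT records.id
FROM records
WHERE records.status IN (0, -10)
  AND records.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
LIMIT 500;

-- Optional, once the new index is confirmed to be used:
-- ALTER TABLE records DROP INDEX created_and_status_idx;
```

Writing the condition as status IN (0, -10) is equivalent to the original OR and makes the intent clearer.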

(Don't give me any guff about "cardinality"; we are not looking at individual columns.)

Are there really only 3 columns in the table? Do you need id? If not, get rid of it and change to

PRIMARY KEY(status, created)
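One caveat: a primary key must be unique, so this only works if no two rows share the same (status, created) pair. A hedged sketch of checking that first and then making the change, assuming nothing else refers to id:

```sql
-- Look for any duplicate (status, created) pair; an empty result means none exist.
SELECT status, created, COUNT(*) AS n
FROM records
GROUP BY status, created
HAVING n > 1
LIMIT 1;

-- Only if the check above returns no rows, and id is not needed elsewhere:
ALTER TABLE records
  DROP COLUMN id,
  ADD PRIMARY KEY (status, created);
```

If the check does return a row, keeping id and using the secondary index on (status, created) is the safer route.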

Other techniques for walking through large tables efficiently.

