
Search distinct date parts fast in MySQL

I've got a database of ~10 million entries, each of which contains a date stored as DATE.

I've indexed that column using a non-unique BTREE.

I'm running a query that counts the number of entries for each distinct year:

SELECT DISTINCT(YEAR(awesome_date)) as year, COUNT(id) as count
FROM all_entries
WHERE awesome_date IS NOT NULL
GROUP BY YEAR(awesome_date)
ORDER BY year DESC;

The query takes about 90 seconds to run at the moment, and the EXPLAIN output demonstrates why:

id | select_type | table        | type  | possible_keys | key | key_len | ref | rows     | Extra
----------------------------------------------------------------------------------------------------------------------------------------
1  | SIMPLE      | all_entries  | ALL   | awesome_date  |     |         |     | 9759848  |  Using where; Using temporary; Using filesort

If I FORCE KEY(awesome_date), the rows estimate drops to ~8 million and key_len = 4, but Extra is still Using where; Using temporary; Using filesort.

I also run queries selecting DISTINCT(MONTH(awesome_date)) and DISTINCT(DAY(awesome_date)) with the relevant WHERE conditions restricting them to a particular year or month.

Other than storing the year, month and day information in separate columns, is there a way of speeding up this query and/or avoiding temporary tables and filesort?

Without splitting the date into three columns, you could:

  • First, remove the DISTINCT. It is redundant, because GROUP BY already returns one row per year.

  • Remove the ORDER BY year, which would help improve speed (a bit). Change the GROUP BY to: GROUP BY YEAR(awesome_date) DESC (this works in the MySQL dialect only, and only before MySQL 8.0.13, which removed the ASC/DESC qualifiers for GROUP BY).

  • Change the COUNT(id) to COUNT(*) (assuming that id can never be NULL, this is faster in many MySQL versions).

In all, the query will become:

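SELECT YEAR(awesome_date) AS year
     , COUNT(*) AS cnt              -- not good practice to use keywords
                                    -- such as COUNT for aliases
FROM all_entries
WHERE awesome_date IS NOT NULL
-- GROUP BY ... DESC is a MySQL extension that was removed in MySQL 8.0.13;
-- on newer servers use GROUP BY YEAR(awesome_date) ORDER BY year DESC
GROUP BY YEAR(awesome_date) DESC ;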
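
For the per-month and per-day queries mentioned in the question, one thing that may help is restricting the year with a range on awesome_date itself rather than with YEAR(awesome_date) = ..., so that the existing index can narrow the scan. This is only a sketch: the year 2011 and the alias month_num are made-up examples, and whether the optimizer actually picks the index should be checked with EXPLAIN.

```sql
-- Count entries per month within one (hypothetical) year.
-- The half-open date range is a plain comparison on the indexed column,
-- so the BTREE index on awesome_date can be used for a range scan.
SELECT MONTH(awesome_date) AS month_num
     , COUNT(*) AS cnt
FROM all_entries
WHERE awesome_date >= '2011-01-01'
  AND awesome_date <  '2012-01-01'
GROUP BY MONTH(awesome_date)
ORDER BY month_num DESC ;
```

The same pattern applies to the per-day query by narrowing the range to a single month.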

Even better (faster) solutions are:

  • your proposal to split the column into 3 (year, month, day)

  • change from MySQL to MariaDB (a MySQL fork), use a PERSISTENT virtual (computed) column for the year, and add an index on that column.

  • stay in MySQL and add a persistent year column yourself - by using triggers.
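
The trigger-based option could look roughly like the following. It is a minimal sketch, not a tested migration: the column name awesome_year, the index name idx_awesome_year and both trigger names are hypothetical, and the backfill UPDATE on ~10 million rows may take a while and should be tried on a copy first.

```sql
-- 1. Add the extra column (hypothetical name awesome_year).
ALTER TABLE all_entries
  ADD COLUMN awesome_year SMALLINT UNSIGNED NULL;

-- 2. Backfill existing rows once; YEAR(NULL) yields NULL.
UPDATE all_entries
SET awesome_year = YEAR(awesome_date);

-- 3. Index the new column after the backfill.
ALTER TABLE all_entries
  ADD INDEX idx_awesome_year (awesome_year);

-- 4. Keep the column in sync on every insert and update.
CREATE TRIGGER all_entries_year_bi
BEFORE INSERT ON all_entries
FOR EACH ROW
  SET NEW.awesome_year = YEAR(NEW.awesome_date);

CREATE TRIGGER all_entries_year_bu
BEFORE UPDATE ON all_entries
FOR EACH ROW
  SET NEW.awesome_year = YEAR(NEW.awesome_date);

-- 5. Group on the indexed column instead of on an expression.
SELECT awesome_year AS year
     , COUNT(*) AS cnt
FROM all_entries
WHERE awesome_year IS NOT NULL
GROUP BY awesome_year
ORDER BY awesome_year DESC ;
```

On MariaDB, steps 2 and 4 would not be needed: the column can instead be declared as awesome_year SMALLINT UNSIGNED AS (YEAR(awesome_date)) PERSISTENT and indexed in the same way.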
