I've got a database of ~10 million entries, each of which contains a date stored as DATE.
I've indexed that column using a non-unique BTREE.
I'm running a query that counts the number of entries for each distinct year:
SELECT DISTINCT(YEAR(awesome_date)) as year, COUNT(id) as count
FROM all_entries
WHERE awesome_date IS NOT NULL
GROUP BY YEAR(awesome_date)
ORDER BY year DESC;
The query takes about 90 seconds to run at the moment, and the EXPLAIN output demonstrates why:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
----------------------------------------------------------------------------------------------------------------------------------------
1 | SIMPLE | all_entries | ALL | awesome_date | | | | 9759848 | Using where; Using temporary; Using filesort
If I FORCE KEY(awesome_date), that drops the rows count down to ~8 million and the key_len = 4, but it is still Using where; Using temporary; Using filesort.
I also run queries selecting DISTINCT(MONTH(awesome_date)) and DISTINCT(DAY(awesome_date)) with the relevant WHERE conditions restricting them to a particular year or month.
Other than storing the year, month and day information in separate columns, is there a way of speeding up this query and/or avoiding temporary tables and filesort?
Without splitting the date to 3 columns, you could:
First, you should remove the DISTINCT; it is useless, because the GROUP BY already returns one row per year.
Remove the ORDER BY year, which would help improve speed (a bit), and change the GROUP BY to: GROUP BY YEAR(awesome_date) DESC (this works in the MySQL dialect only, and only before MySQL 8.0, which dropped the GROUP BY ... DESC extension).
Change the COUNT(id) to COUNT(*) (assuming that id can never be NULL, this is faster in many MySQL versions).
In all, the query will become:
SELECT YEAR(awesome_date) AS year
, COUNT(*) AS cnt -- not good practice to use reserved words
-- for aliases
FROM all_entries
WHERE awesome_date IS NOT NULL
GROUP BY YEAR(awesome_date) DESC ;
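A quick way to see whether these changes help is to prefix the rewritten query with EXPLAIN and compare the type, key and Extra columns with the output shown in the question. This is only a sketch for checking the plan; what the optimizer actually chooses depends on the MySQL version and storage engine.

```sql
-- Show the execution plan for the rewritten query instead of running it.
-- Table and column names are the ones from the question.
-- On MySQL 8.0 and later, drop the DESC from GROUP BY and add
-- ORDER BY year DESC instead.
EXPLAIN
SELECT YEAR(awesome_date) AS year
     , COUNT(*) AS cnt
FROM all_entries
WHERE awesome_date IS NOT NULL
GROUP BY YEAR(awesome_date) DESC;
```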
Even better (faster) solutions are:
your proposal to split the column into 3 (year, month, day)
change from MySQL to MariaDB (a MySQL fork) and use a PERSISTENT virtual column for the year, with an index on that virtual column.
stay in MySQL and add a persistent year column yourself, kept up to date by triggers.
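The trigger-based option could look roughly like the sketch below. The column name awesome_year, the index name and the trigger names are made up for illustration; only all_entries and awesome_date come from the question. Treat it as a starting point, not a tested migration.

```sql
-- Hypothetical column and index names; adjust to your schema.
ALTER TABLE all_entries
  ADD COLUMN awesome_year SMALLINT UNSIGNED NULL,
  ADD INDEX idx_awesome_year (awesome_year);

-- One-off backfill of existing rows (YEAR(NULL) stays NULL).
-- On ~10 million rows this may take a while.
UPDATE all_entries SET awesome_year = YEAR(awesome_date);

-- Keep the column in sync on every write. Single-statement trigger
-- bodies do not need a DELIMITER change.
CREATE TRIGGER all_entries_year_bi
BEFORE INSERT ON all_entries
FOR EACH ROW SET NEW.awesome_year = YEAR(NEW.awesome_date);

CREATE TRIGGER all_entries_year_bu
BEFORE UPDATE ON all_entries
FOR EACH ROW SET NEW.awesome_year = YEAR(NEW.awesome_date);

-- The yearly count can then group on the indexed column directly.
SELECT awesome_year AS year
     , COUNT(*) AS cnt
FROM all_entries
WHERE awesome_year IS NOT NULL
GROUP BY awesome_year
ORDER BY awesome_year DESC;
```

On MariaDB the triggers and the backfill should not be needed, since the column can be declared as computed, along the lines of ADD COLUMN awesome_year SMALLINT AS (YEAR(awesome_date)) PERSISTENT, and then indexed the same way.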