
Grouping by year and month with MySQL while leveraging indexes and avoiding temporary/filesort

I have a large dataset (which is going to keep on growing!) where the data being read in bulk is stored with a DATE column, as all rows in any of the core data tables belong to a specific day (context is analytics/reporting).

A lot of the views require data on a per month rather than per day detail level, and I'm aggregating the data as needed via SQL (SUM, AVG, etc).

This also means I'm grouping data by YEAR() and MONTH(), which cannot use the index on the DATE column and results in "Using temporary" and "Using filesort" in the EXPLAIN output.
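For illustration, a query of the kind described might look like the sketch below. The table `daily_stats` and its columns are hypothetical names, not taken from the actual schema, and the exact EXPLAIN output will depend on the MySQL version and the data.

```sql
-- Hypothetical table: one row per item per day (names are assumptions).
CREATE TABLE daily_stats (
  id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  stat_date DATE NOT NULL,
  amount    DECIMAL(12,2) NOT NULL,
  INDEX idx_stat_date (stat_date)
);

-- Grouping on YEAR()/MONTH() wraps the indexed column in functions,
-- so the optimizer cannot use idx_stat_date to produce the groups.
EXPLAIN
SELECT YEAR(stat_date)  AS yr,
       MONTH(stat_date) AS mon,
       SUM(amount)      AS total_amount,
       AVG(amount)      AS avg_amount
FROM daily_stats
GROUP BY YEAR(stat_date), MONTH(stat_date);
```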

Is the best solution here to split the DATE column into 3 separate columns for year, month and day? Or to retain the DATE column (constraint, sorting, etc) and have a "yearmonth" (yyyymm) column which is also indexed? I don't like duplicating data but I'm just not 100% on what would be the best practice for this scenario.

I think the best approach in terms of performance when GROUP-ing and SELECT-ing by month and year is to add MONTH and YEAR columns to the data. The speed you gain from proper index usage will outweigh the cost of some extra, duplicated data.

Note that there is a YEAR datatype in MySQL.

Make sure to use BTREE indexes on the month and year columns (not HASH).
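A minimal sketch of this approach follows. It assumes MySQL 5.7 or later, where stored generated columns can keep the year and month in sync with the DATE column automatically. The table and column names are hypothetical, and using generated columns is my assumption rather than something the answer prescribes; plain columns filled in by the application would work as well.

```sql
-- Hypothetical table; stat_year and stat_month are derived from stat_date.
CREATE TABLE daily_stats (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  stat_date  DATE NOT NULL,
  amount     DECIMAL(12,2) NOT NULL,
  -- Stored generated columns (MySQL 5.7+): maintained by the server.
  stat_year  YEAR             AS (YEAR(stat_date))  STORED,
  stat_month TINYINT UNSIGNED AS (MONTH(stat_date)) STORED,
  -- Composite BTREE index matching the GROUP BY column order.
  INDEX idx_year_month (stat_year, stat_month) USING BTREE
);

-- Group on the plain indexed columns instead of on function calls.
EXPLAIN
SELECT stat_year,
       stat_month,
       SUM(amount) AS total_amount
FROM daily_stats
GROUP BY stat_year, stat_month;
```

Whether the temporary table and filesort actually disappear depends on the version and the rest of the query, so it is worth comparing the EXPLAIN output before and after on real data.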

Do not split a DATE into component parts. The difficulties outweigh the presumed benefit.

Use Summary Tables to avoid lengthy analytics/reporting queries. See my blog on the subject. Roughly speaking, every night you would calculate some subtotals and counts for the past day, and put these in a "Summary Table". Analytics would run much faster against that table than against the "Fact" table.

For AVG, be sure to store SUM() and COUNT(*), then compute (in the Report) SUM(sums) / SUM(counts) as the Average.
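As a rough sketch of the summary-table idea, assuming a hypothetical fact table `fact_events` with a DATE column and an `amount` value (all names and the date range are illustrative):

```sql
-- Hypothetical "Fact" table: many rows per day.
CREATE TABLE fact_events (
  id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  stat_date DATE NOT NULL,
  amount    DECIMAL(12,2) NOT NULL,
  INDEX idx_stat_date (stat_date)
);

-- Summary table: one row per day, storing the sum and the count
-- rather than a precomputed average.
CREATE TABLE daily_summary (
  stat_date  DATE NOT NULL PRIMARY KEY,
  amount_sum DECIMAL(20,2) NOT NULL,
  row_count  INT UNSIGNED NOT NULL
);

-- Nightly job: summarize yesterday's rows.
-- REPLACE makes a rerun overwrite the day instead of failing.
REPLACE INTO daily_summary (stat_date, amount_sum, row_count)
SELECT stat_date, SUM(amount), COUNT(*)
FROM fact_events
WHERE stat_date = CURRENT_DATE - INTERVAL 1 DAY
GROUP BY stat_date;

-- Monthly report: the average is SUM(sums) / SUM(counts),
-- not an average of the daily averages.
SELECT DATE_FORMAT(stat_date, '%Y-%m')  AS yearmonth,
       SUM(amount_sum)                  AS total_amount,
       SUM(amount_sum) / SUM(row_count) AS avg_amount
FROM daily_summary
WHERE stat_date >= '2024-01-01'
  AND stat_date <  '2025-01-01'
GROUP BY yearmonth;
```

The report still groups on an expression, but it reads at most one row per day, so the cost of the temporary table is small compared with scanning the fact table. Storing the sum and count separately keeps the monthly average correct even when days have different row counts.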


 