
Optimizing existing MAX-MIN query for a huge table

I have a query that works more or less well (as far as the result is concerned), but it takes about 45 seconds to process. That is definitely too long for presenting the data in a GUI.
So my goal is a much faster/more efficient query (something around a few milliseconds would be nice). My data table currently has around ~2,619,395 entries and is still growing.

Schema:

num | station | fetchDate             | exportValue | error
1   | PS1     | 2010-10-01 07:05:17   | 300         | 0
2   | PS2     | 2010-10-01 07:05:19   | 297         | 0
923 | PS1     | 2011-11-13 14:45:47   | 82771       | 0
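
For reference, the table definition behind this schema might look roughly like the following; the column types are assumptions inferred from the sample rows, not the actual DDL:

CREATE TABLE registros (
    num         INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- assumed surrogate key
    station     VARCHAR(10)  NOT NULL,                 -- e.g. 'PS1', 'PS2'
    fetchDate   DATETIME     NOT NULL,                 -- timestamp of the reading
    exportValue INT UNSIGNED NOT NULL,                 -- always-increasing counter
    error       TINYINT      NOT NULL DEFAULT 0,       -- 0 = station working properly
    PRIMARY KEY (num)
);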

Explanation:

  • exportValue is always increasing
  • exportValue represents the actual absolute value
  • in my case there are 10 stations
  • every ~15 minutes, 10 new entries (one per station) are written to the table
  • error is just an indicator for a properly working station (0 = working properly)

Working query:

select
    YEAR(fetchDate), station, MAX(exportValue) - MIN(exportValue)
from
    registros
where
    exportValue > 0 and error = 0
group by
    station, YEAR(fetchDate)
order by
    YEAR(fetchDate), station

Output:

Year | station | Max-Min
2008 | PS1     | 24012
2008 | PS2     | 23709
2009 | PS1     | 28102
2009 | PS2     | 25098

My thoughts on it:

  1. writing several queries with BETWEEN clauses, like BETWEEN '2008-01-01' AND '2008-01-02' to fetch the MIN(exportValue) and BETWEEN '2008-12-30' AND '2008-12-31' to grab the MAX(exportValue) (see the sketch after this list). Problem: a lot of queries, and there is no guarantee that a given time range contains any data.
  2. limiting the result set to my 10 stations using ORDER BY MIN(fetchDate). Problem: this also takes a long time to process.
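
For illustration, idea 1 would boil down to a pair of queries per station and year, something like the following sketch (the dates and station are example values only):

-- first reading of the year (exportValue is always increasing,
-- so MIN over the first days gives the year's starting value)
SELECT MIN(exportValue)
FROM registros
WHERE station = 'PS1' AND error = 0 AND exportValue > 0
  AND fetchDate BETWEEN '2008-01-01' AND '2008-01-02';

-- last reading of the year
SELECT MAX(exportValue)
FROM registros
WHERE station = 'PS1' AND error = 0 AND exportValue > 0
  AND fetchDate BETWEEN '2008-12-30' AND '2008-12-31';

With 10 stations and several years, this quickly multiplies into dozens of round trips, which is exactly the drawback described in idea 1.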

Additional Info:
I'm using the query in a Java application (JPA 2.0). That means it would be possible to do some post-processing on the result set if necessary.

Any help/approaches/ideas are very much appreciated. Thanks in advance.

Adding suitable indexes will help. Two compound indexes will speed things up significantly:

ALTER TABLE tbl_name ADD INDEX (error, exportValue);
ALTER TABLE tbl_name ADD INDEX (station, fetchDate);

With these indexes in place, this query should be extremely fast, even with millions of records.
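
You can verify that the optimizer actually uses one of the new indexes by prefixing the query with EXPLAIN:

EXPLAIN
SELECT YEAR(fetchDate), station, MAX(exportValue) - MIN(exportValue)
FROM registros
WHERE exportValue > 0 AND error = 0
GROUP BY station, YEAR(fetchDate);

If the key column in the output names one of the indexes added above, the index is being used; otherwise the server is still scanning the whole table.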

Suggestions:

  • Do you have a PK set on this table? (station, fetchDate)?
  • Add indexes; you should experiment with indexes as rich.okelly suggested in his answer.
  • Depending on the results of those experiments, try breaking your query into multiple statements inside one stored procedure; this way you will not lose time on network traffic between multiple queries sent from the client to MySQL.
  • You mentioned that you tried separate queries and ran into a problem when there is no data for a particular month; this is a regular case in business applications, and you should handle it in a "master query" (stored procedure or application code).
  • I guess fetchDate is the current date and time at the moment of record insertion; consider keeping previous months' data in a sort of summary table with the fields year, month, station, max(exportValue), min(exportValue) (see the sketch after this list). This means you would insert summary records into the summary table at the end of each month; deleting, keeping, or moving the detail records to a separate table is your choice.
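
A minimal sketch of that summary table and its monthly fill; the table name registros_summary and its column names are made up for illustration:

CREATE TABLE registros_summary (
    year            SMALLINT     NOT NULL,
    month           TINYINT      NOT NULL,
    station         VARCHAR(10)  NOT NULL,
    maxExportValue  INT UNSIGNED NOT NULL,
    minExportValue  INT UNSIGNED NOT NULL,
    PRIMARY KEY (year, month, station)
);

-- run once at the end of each month, e.g. for November 2011:
INSERT INTO registros_summary (year, month, station, maxExportValue, minExportValue)
SELECT 2011, 11, station, MAX(exportValue), MIN(exportValue)
FROM registros
WHERE exportValue > 0 AND error = 0
  AND fetchDate >= '2011-11-01' AND fetchDate < '2011-12-01'
GROUP BY station;

The yearly report then only has to aggregate 12 summary rows per station instead of roughly 35,000 detail rows.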

Since your table is growing rapidly (10 new rows every ~15 minutes), you should take the last suggestion into account. There is probably no need to keep the detailed history in one place. Archiving data is a process that should be done as part of regular maintenance.
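
One possible way to automate that roll-up on the server side is MySQL's event scheduler (available since MySQL 5.1). A sketch, reusing the hypothetical registros_summary table from above (the event name is made up, and the scheduler must be enabled with SET GLOBAL event_scheduler = ON):

CREATE EVENT monthly_rollup
ON SCHEDULE EVERY 1 MONTH STARTS '2012-01-01 00:05:00'
DO
  INSERT INTO registros_summary (year, month, station, maxExportValue, minExportValue)
  SELECT YEAR(fetchDate), MONTH(fetchDate), station, MAX(exportValue), MIN(exportValue)
  FROM registros
  -- aggregate the month that has just ended
  WHERE exportValue > 0 AND error = 0
    AND fetchDate >= DATE_FORMAT(NOW() - INTERVAL 1 MONTH, '%Y-%m-01')
    AND fetchDate <  DATE_FORMAT(NOW(), '%Y-%m-01')
  GROUP BY station;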
