I have a database schema that looks like this (see http://sqlfiddle.com/#!2/4c9b4/1/0):
create table t( id int, dataA int, dataB int);
insert into t select 1 ,1 ,1;
insert into t select 2 ,1 ,2;
insert into t select 3 ,1 ,3;
insert into t select 4 ,2 ,1;
insert into t select 5 ,2 ,2;
insert into t select 6 ,2 ,4;
insert into t select 7 ,3 ,1;
insert into t select 8 ,3 ,2;
insert into t select 9 ,4 ,1;
And an SQL query that fetches, for each "dataA", the row(s) with the maximum "dataB":
SELECT * FROM t a WHERE dataB = (SELECT MAX(dataB) FROM t b WHERE b.dataA = a.dataA)
It works, but it can take up to 90 seconds to run on my dataset.
How can I improve the performance of this query?
MySQL may be re-executing the correlated subquery for every row, even when the dataA value repeats. The following statement computes max(dataB) only once per dataA; the rest is a simple join. I hope this is faster.
select t.*
from t
join (select dataA, max(dataB) as maxDataB from t group by dataA) max_t
on t.dataA = max_t.dataA and t.dataB = max_t.maxDataB;
EDIT: Here is your SQL fiddle: http://sqlfiddle.com/#!2/4c9b4/2 .
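To check that the join really returns the same rows as the original correlated subquery, here is a minimal sketch on the sample data. It assumes Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; it says nothing about MySQL's timings, only about the two queries producing the same result.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id int, dataA int, dataB int)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 1, 3),
     (4, 2, 1), (5, 2, 2), (6, 2, 4),
     (7, 3, 1), (8, 3, 2), (9, 4, 1)],
)

# Original query: correlated subquery evaluated against each outer row.
correlated = """
    select * from t a
    where dataB = (select max(dataB) from t b where b.dataA = a.dataA)
    order by id
"""

# Rewritten query: aggregate once per dataA, then join back to t.
joined = """
    select t.* from t
    join (select dataA, max(dataB) as maxDataB from t group by dataA) max_t
      on t.dataA = max_t.dataA and t.dataB = max_t.maxDataB
    order by t.id
"""

correlated_rows = conn.execute(correlated).fetchall()
joined_rows = conn.execute(joined).fetchall()

print(correlated_rows)                 # [(3, 1, 3), (6, 2, 4), (8, 3, 2), (9, 4, 1)]
print(joined_rows == correlated_rows)  # True
```

Both forms keep every row that ties for the maximum within a dataA group, so they should stay interchangeable when dataB has duplicates.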
MySQL does not always optimize aggregation well. The first thing to try is an index:
create index t_dataA_dataB on t(dataA, dataB);
That will probably fix the problem. The second thing to try is the following trick:
select a.*
from t a
where not exists (select 1
                  from t a2
                  where a2.dataA = a.dataA and
                        a2.dataB > a.dataB
                 );
This transforms "get me the max" into the equivalent: "Get me all rows from t where there are no rows with the same dataA and a bigger dataB".
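The equivalence between the max form and the NOT EXISTS form can be checked with a small sketch. It again assumes Python's built-in sqlite3 module in place of MySQL. The rows with id 10 and 11 are hypothetical additions to the sample data, included only to exercise a tie and a NULL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id int, dataA int, dataB int)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 1, 3),
     (4, 2, 1), (5, 2, 2), (6, 2, 4),
     (7, 3, 1), (8, 3, 2), (9, 4, 1),
     (10, 4, 1)],  # hypothetical extra row: ties with id 9 for dataA = 4
)

max_form = """
    select a.id from t a
    where dataB = (select max(dataB) from t b where b.dataA = a.dataA)
    order by a.id
"""
not_exists_form = """
    select a.id from t a
    where not exists (select 1 from t a2
                      where a2.dataA = a.dataA and a2.dataB > a.dataB)
    order by a.id
"""

def ids(query):
    """Run a query and return the first column of each row as a list."""
    return [row[0] for row in conn.execute(query)]

max_ids = ids(max_form)
not_exists_ids = ids(not_exists_form)
print(max_ids, not_exists_ids)  # [3, 6, 8, 9, 10] [3, 6, 8, 9, 10]

# Hypothetical row with a NULL dataB: the two forms no longer agree.
conn.execute("insert into t values (11, 5, NULL)")
max_ids_null = ids(max_form)
not_exists_ids_null = ids(not_exists_form)
print(max_ids_null)         # [3, 6, 8, 9, 10]
print(not_exists_ids_null)  # [3, 6, 8, 9, 10, 11]
```

On non-NULL data the two forms return the same rows, ties included. If dataB can be NULL, the NOT EXISTS form also returns the NULL rows, because the comparison `a2.dataB > a.dataB` never evaluates to true for them; an extra `a.dataB is not null` condition would be needed to match the max form exactly.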