Could you help me with this query, please?
SELECT p.patid, MAX(c1.eventdate) as eventdate
from patient as p
left join op_adv_effects._clinical as c1 on p.patid = c1.patid
where c1.eventdate < p.case_index
group by p.patid
Here is the output of SHOW CREATE TABLE for the 2 tables:
patient CREATE TABLE `patient` (
`patid` int(10) unsigned NOT NULL,
`case_index` date NOT NULL,
PRIMARY KEY (`patid`,`case_index`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs
_clinical CREATE TABLE `_clinical` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`patid` int(10) unsigned NOT NULL,
`eventdate` date NOT NULL,
`medcode` mediumint(8) unsigned DEFAULT NULL,
`adid` mediumint(8) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_clin_eventdate_medcode` (`patid`,`eventdate`,`medcode`),
KEY `idx_clin_eventdate_adid` (`patid`,`eventdate`,`adid`)
) ENGINE=InnoDB AUTO_INCREMENT=62407536 DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs
"explain" returns the following:
*************************** 1. row ********************
id: 1
select_type: SIMPLE
table: p
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 7
ref: NULL
rows: 182939
Extra: Using index
*************************** 2. row ********************
id: 1
select_type: SIMPLE
table: c1
type: ref
possible_keys: idx_clin_eventdate_medcode,idx_clin_eventdate_adid
key: idx_clin_eventdate_medcode
key_len: 4
ref: gprd_opadveff_extra_elisa.p.patid
rows: 171
Extra: Using where; Using index
Why is it not using the first 2 fields of idx_clin_eventdate_medcode, i.e. (patid, eventdate), but only patid (see the ref column)?
If I change the where condition to equality, it works fine:
SELECT p.patid, MAX(c1.eventdate) as eventdate
from patient as p
left join op_adv_effects._clinical as c1 on p.patid = c1.patid
where c1.eventdate = p.case_index
group by p.patid
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 7
ref: NULL
rows: 182939
Extra: Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: c1
type: ref
possible_keys: idx_clin_eventdate_medcode,idx_clin_eventdate_adid
key: idx_clin_eventdate_medcode
key_len: 7
ref: gprd_opadveff_extra_elisa.p.patid,gprd_opadveff_extra_elisa.p.case_index
rows: 1
Extra: Using index
I get the same result (key_len is still 4) with the suggested variations:
explain SELECT patid,
(SELECT eventdate
FROM op_adv_effects._clinical
WHERE patid = p.patid
AND eventdate < p.case_index
ORDER BY eventdate DESC
LIMIT 1 ) AS eventdate
FROM patient AS p;
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: p
type: index
possible_keys: NULL
key: PRIMARY
key_len: 7
ref: NULL
rows: 182939
Extra: Using index
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: _clinical
type: ref
possible_keys: idx_clin_eventdate_medcode,idx_clin_eventdate_adid
key: idx_clin_eventdate_medcode
key_len: 4
ref: gprd_opadveff_extra_elisa.p.patid
rows: 171
Extra: Using where; Using index; Using filesort
explain SELECT patid,
( SELECT MAX(eventdate)
FROM op_adv_effects._clinical
WHERE patid = p.patid
AND eventdate < p.case_index) AS eventdate
FROM patient AS p;
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: p
type: index
possible_keys: NULL
key: PRIMARY
key_len: 7
ref: NULL
rows: 182939
Extra: Using index
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: _clinical
type: ref
possible_keys: idx_clin_eventdate_medcode,idx_clin_eventdate_adid
key: idx_clin_eventdate_medcode
key_len: 4
ref: gprd_opadveff_extra_elisa.p.patid
rows: 171
Extra: Using where; Using index
The query is part of a more complex one, shown below. It is only one of several complex queries that should all use the eventdate part of the index, which is why this matters.
CREATE TABLE bmi_lp
(PRIMARY KEY (patid))
ENGINE=INNODB DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs
SELECT tmp.patid, a2.data3 as bmi_lp, tmp.eventdate as bmi_lp_date
from (
SELECT p.patid, MAX(c1.eventdate) as eventdate
from patient as p
left join op_adv_effects._clinical as c1 on p.patid = c1.patid
left join op_adv_effects._additional as a1 on c1.patid = a1.patid
where c1.adid <> 0 and c1.adid = a1.adid
and a1.enttype = 13
and a1.data3 is not null
and c1.eventdate < p.case_index
group by p.patid
order by p.patid) as tmp
left join op_adv_effects._clinical as c2 on tmp.patid = c2.patid
left join op_adv_effects._additional as a2 on c2.patid = a2.patid
where tmp.eventdate = c2.eventdate and c2.adid = a2.adid
Because the WHERE clause filters on c1.eventdate, you are effectively doing an INNER JOIN right now. Did you intend that?
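If the intent is to keep every patient, including those with no earlier event, one common fix is to move the date condition from WHERE into the ON clause. A minimal sketch using the tables from the question, assuming that a NULL eventdate is the desired result for patients without a match:

```sql
-- Sketch: the date filter lives in ON, so patients with no matching
-- _clinical row are kept and get eventdate = NULL instead of being dropped.
SELECT p.patid, MAX(c1.eventdate) AS eventdate
FROM patient AS p
LEFT JOIN op_adv_effects._clinical AS c1
       ON c1.patid = p.patid
      AND c1.eventdate < p.case_index
GROUP BY p.patid;
```

This changes which rows come back, not how the index is read, so it is worth re-running EXPLAIN afterwards.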
Regardless, because of the < comparison, the eventdate part of the index cannot be used for the lookup right now. An index with a different sort order might work.
For example, in PostgreSQL you could do this:
CREATE INDEX idx_clin_eventdate_medcode ON _clinical (patid ASC, eventdate DESC);
In MySQL, unfortunately, the DESC and ASC keywords in an index definition are parsed but ignored (in every MySQL version up to 5.7 at least). So unless you can reverse the query (use > instead of <), MySQL can't use the index effectively.
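For reference, MySQL 8.0 and later do store descending key parts as declared, so the PostgreSQL-style definition becomes meaningful there. A sketch, where the index name idx_clin_patid_eventdate_desc is made up for illustration; whether the optimizer actually chooses it for this query is not guaranteed and should be checked with EXPLAIN:

```sql
-- MySQL 8.0+ only: older versions parse DESC here but build an ascending index.
-- The index name is hypothetical.
ALTER TABLE op_adv_effects._clinical
  ADD INDEX idx_clin_patid_eventdate_desc (patid ASC, eventdate DESC);

-- Re-check the plan for the per-patient lookup after adding the index.
EXPLAIN
SELECT p.patid,
       (SELECT MAX(c.eventdate)
          FROM op_adv_effects._clinical AS c
         WHERE c.patid = p.patid
           AND c.eventdate < p.case_index) AS eventdate
  FROM patient AS p;
```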
Note that it might even be faster not to use the index; it depends on the case. Since the lookup only goes through an estimated 171 rows per patient, I wouldn't be too worried.
Give this a try:
SELECT patid,
( SELECT MAX(eventdate)
FROM op_adv_effects._clinical
WHERE patid = p.patid
AND eventdate < p.case_index
) AS eventdate
FROM patient AS p;
(No GROUP BY needed.)
Here's a variant that uses LIMIT 1 instead of MAX:
SELECT patid,
( SELECT eventdate
FROM op_adv_effects._clinical
WHERE patid = p.patid
AND eventdate < p.case_index
ORDER BY eventdate DESC
LIMIT 1
) AS eventdate
FROM patient AS p;
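To see what these subquery forms return, here is a small self-contained sketch on throwaway data. The table names patient_demo and clinical_demo and all the dates are invented for illustration; only the column layout follows the question.

```sql
-- Hypothetical demo tables mirroring the relevant columns from the question.
CREATE TABLE patient_demo (
  patid      INT UNSIGNED NOT NULL,
  case_index DATE NOT NULL,
  PRIMARY KEY (patid, case_index)
);
CREATE TABLE clinical_demo (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
  patid     INT UNSIGNED NOT NULL,
  eventdate DATE NOT NULL,
  PRIMARY KEY (id),
  KEY idx_demo_patid_eventdate (patid, eventdate)
);

INSERT INTO patient_demo VALUES (1, '2010-06-01'), (2, '2010-06-01');
INSERT INTO clinical_demo (patid, eventdate) VALUES
  (1, '2010-01-15'), (1, '2010-05-20'), (1, '2010-07-01'),
  (2, '2010-06-01');   -- on the case_index itself, so not strictly earlier

-- MAX form: patid 1 gives 2010-05-20, patid 2 gives NULL.
SELECT p.patid,
       (SELECT MAX(c.eventdate)
          FROM clinical_demo AS c
         WHERE c.patid = p.patid
           AND c.eventdate < p.case_index) AS eventdate
  FROM patient_demo AS p;

-- LIMIT 1 form: same values as the MAX form.
SELECT p.patid,
       (SELECT c.eventdate
          FROM clinical_demo AS c
         WHERE c.patid = p.patid
           AND c.eventdate < p.case_index
         ORDER BY c.eventdate DESC
         LIMIT 1) AS eventdate
  FROM patient_demo AS p;

-- Join form with the filter in WHERE: only patid 1 is returned.
SELECT p.patid, MAX(c.eventdate) AS eventdate
  FROM patient_demo AS p
  LEFT JOIN clinical_demo AS c ON c.patid = p.patid
 WHERE c.eventdate < p.case_index
 GROUP BY p.patid;

DROP TABLE clinical_demo, patient_demo;
```

Unlike the join with the filter in WHERE, the subquery forms return one row per row of patient, so patients with no earlier event still appear, with a NULL eventdate.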
How many rows are in the output?