Why is this MySQL query not using the complete index?

Could you help me with this query, please?

SELECT p.patid, MAX(c1.eventdate) as eventdate 
from patient as p 
left join op_adv_effects._clinical as c1 on p.patid = c1.patid 
where c1.eventdate < p.case_index 
group by p.patid

Here is the output of SHOW CREATE TABLE for the 2 tables:

patient CREATE TABLE `patient` (
  `patid` int(10) unsigned NOT NULL,
  `case_index` date NOT NULL,
  PRIMARY KEY (`patid`,`case_index`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs

_clinical   CREATE TABLE `_clinical` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `patid` int(10) unsigned NOT NULL,
  `eventdate` date NOT NULL,
  `medcode` mediumint(8) unsigned DEFAULT NULL,
  `adid` mediumint(8) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_clin_eventdate_medcode` (`patid`,`eventdate`,`medcode`),
  KEY `idx_clin_eventdate_adid` (`patid`,`eventdate`,`adid`)
) ENGINE=InnoDB AUTO_INCREMENT=62407536 DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs

"explain" returns the following:

*************************** 1. row ********************
           id: 1
  select_type: SIMPLE
        table: p
         type: index
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 7
          ref: NULL
         rows: 182939
        Extra: Using index
*************************** 2. row ********************
           id: 1
  select_type: SIMPLE
        table: c1
         type: ref
possible_keys: idx_clin_eventdate_medcode,idx_clin_eventdate_adid
          key: idx_clin_eventdate_medcode
      key_len: 4
          ref: gprd_opadveff_extra_elisa.p.patid
         rows: 171
        Extra: Using where; Using index

Why is it using only patid (see the ref column) rather than the first two fields of idx_clin_eventdate_medcode, i.e. (patid, eventdate)?

If I change the where condition to equality, it works fine:

SELECT p.patid, MAX(c1.eventdate) as eventdate 
from patient as p 
left join op_adv_effects._clinical as c1 on p.patid = c1.patid 
where c1.eventdate = p.case_index 
group by p.patid

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
         type: index
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 7
          ref: NULL
         rows: 182939
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: c1
         type: ref
possible_keys: idx_clin_eventdate_medcode,idx_clin_eventdate_adid
          key: idx_clin_eventdate_medcode
      key_len: 7
          ref: gprd_opadveff_extra_elisa.p.patid,gprd_opadveff_extra_elisa.p.case_index
         rows: 1
        Extra: Using index

Same results for some suggested variations:

explain SELECT  patid,
(SELECT  eventdate
FROM  op_adv_effects._clinical
WHERE  patid = p.patid
AND eventdate < p.case_index
ORDER BY  eventdate DESC
LIMIT  1 ) AS eventdate
FROM  patient AS p;

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: p
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 7
          ref: NULL
         rows: 182939
        Extra: Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: _clinical
         type: ref
possible_keys: idx_clin_eventdate_medcode,idx_clin_eventdate_adid
          key: idx_clin_eventdate_medcode
      key_len: 4
          ref: gprd_opadveff_extra_elisa.p.patid
         rows: 171
        Extra: Using where; Using index; Using filesort


explain SELECT  patid, 
( SELECT  MAX(eventdate)
FROM  op_adv_effects._clinical
WHERE  patid = p.patid
AND  eventdate < p.case_index) AS eventdate
FROM  patient AS p;

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: p
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 7
          ref: NULL
         rows: 182939
        Extra: Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: _clinical
         type: ref
possible_keys: idx_clin_eventdate_medcode,idx_clin_eventdate_adid
          key: idx_clin_eventdate_medcode
      key_len: 4
          ref: gprd_opadveff_extra_elisa.p.patid
         rows: 171
        Extra: Using where; Using index

The query is part of a more complex one, shown below. It is only one example of several complex queries that should all use the eventdate part of the index, which is why this is quite important.

CREATE TABLE bmi_lp
(PRIMARY KEY (patid))
ENGINE=INNODB DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs
SELECT tmp.patid, a2.data3 as bmi_lp, tmp.eventdate as bmi_lp_date 
from ( 
SELECT p.patid, MAX(c1.eventdate) as eventdate 
from patient as p 
left join op_adv_effects._clinical as c1 on p.patid = c1.patid 
left join op_adv_effects._additional as a1 on c1.patid = a1.patid 
where c1.adid <> 0 and c1.adid = a1.adid 
and a1.enttype = 13 
and a1.data3 is not null 
and c1.eventdate < p.case_index 
group by p.patid 
order by p.patid) as tmp 
left join op_adv_effects._clinical   as c2 on tmp.patid = c2.patid 
left join op_adv_effects._additional as a2 on c2.patid = a2.patid 
where tmp.eventdate = c2.eventdate and c2.adid = a2.adid

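Because the WHERE clause filters on c1.eventdate, your LEFT JOIN effectively behaves as an INNER JOIN right now: patients without a matching _clinical row get NULL for c1.eventdate and are discarded by the comparison. Did you intend that?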
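To make the LEFT JOIN point concrete, here is a minimal sketch on two scratch tables. The names demo_patient and demo_clinical and the sample rows are hypothetical and only stand in for the real schema. It shows the same filter placed in WHERE and then in ON:

```sql
-- Hypothetical scratch tables, not the real schema.
CREATE TEMPORARY TABLE demo_patient (
  patid INT UNSIGNED NOT NULL,
  case_index DATE NOT NULL,
  PRIMARY KEY (patid)
);
CREATE TEMPORARY TABLE demo_clinical (
  patid INT UNSIGNED NOT NULL,
  eventdate DATE NOT NULL
);

INSERT INTO demo_patient VALUES (1, '2010-06-01'), (2, '2010-06-01');
-- Patient 2 only has an event AFTER its case_index.
INSERT INTO demo_clinical VALUES (1, '2010-01-15'), (1, '2010-03-20'), (2, '2011-01-01');

-- Filter in WHERE: patient 2 is dropped, as with an INNER JOIN.
SELECT p.patid, MAX(c.eventdate) AS eventdate
FROM demo_patient AS p
LEFT JOIN demo_clinical AS c ON p.patid = c.patid
WHERE c.eventdate < p.case_index
GROUP BY p.patid;

-- Filter in ON: patient 2 is kept, with a NULL eventdate.
SELECT p.patid, MAX(c.eventdate) AS eventdate
FROM demo_patient AS p
LEFT JOIN demo_clinical AS c
       ON p.patid = c.patid AND c.eventdate < p.case_index
GROUP BY p.patid;
```

If patients without an earlier event should still appear in the result, the second form is the one to use; if they should not, a plain INNER JOIN states the intent more directly.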


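Regardless, because of the < comparison, only the patid part of the index is used for the lookup right now (key_len: 4); eventdate is then checked as a filter on the matching index entries (Using where; Using index). With an index that used a different sort order, it might work.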

For example, in PostgreSQL you could do this:

CREATE INDEX idx_clin_eventdate_medcode ON _clinical (patid ASC, eventdate DESC);

Unfortunately, in MySQL the ASC and DESC keywords in an index definition are accepted but have no effect, at least in every version up to 5.7 (descending indexes were only added in MySQL 8.0). So unless you can reverse the query (use > instead of <), MySQL can't use the index effectively.

Note that it might even be faster not to use the index; it depends on the case. Since the lookup only goes through an estimated 171 rows per patient, I wouldn't be too worried.
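As a rough aid for reading the EXPLAIN output in the question, the key_len values can be decoded from the column definitions in SHOW CREATE TABLE. The sketch below assumes the usual InnoDB storage sizes for these types. The EXPLAIN statement is simply the original query again, to be re-run after any index or query change:

```sql
-- Index idx_clin_eventdate_medcode (patid, eventdate, medcode):
--   patid     INT UNSIGNED NOT NULL   -> 4 bytes
--   eventdate DATE NOT NULL           -> 3 bytes
--   medcode   MEDIUMINT UNSIGNED NULL -> 3 bytes + 1 byte for the NULL flag
--
-- key_len = 4          : only patid is used for the lookup (the < query)
-- key_len = 4 + 3 = 7  : patid and eventdate are both used (the = query)

-- Compare key_len and ref on the c1 row before and after a change.
EXPLAIN
SELECT p.patid, MAX(c1.eventdate) AS eventdate
FROM patient AS p
LEFT JOIN op_adv_effects._clinical AS c1 ON p.patid = c1.patid
WHERE c1.eventdate < p.case_index
GROUP BY p.patid;
```

The same arithmetic explains key_len: 7 on the patient row: its primary key is (patid, case_index), i.e. 4 + 3 bytes.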

Give this a try:

SELECT  patid, 
      ( SELECT  MAX(eventdate)
            FROM  op_adv_effects._clinical
            WHERE  patid = p.patid
              AND  eventdate < p.case_index 
      ) AS eventdate
    FROM  patient AS p;

(No GROUP BY needed.)

Here's a variant that uses LIMIT 1 instead of MAX:

SELECT  patid, 
      ( SELECT  eventdate
            FROM  op_adv_effects._clinical
            WHERE  patid = p.patid
              AND  eventdate < p.case_index
            ORDER BY  eventdate DESC
            LIMIT  1 
      ) AS eventdate
    FROM  patient AS p;

How many rows are in the output?
