
How to make this complicated query faster [MySQL]?

I have the following query:

SELECT JL.j_id, COUNT(*) as total
FROM j_log JL
WHERE JL.log_time > '20120205164008'
AND JL.j_id IN (
     SELECT j_id 
     FROM j 
     WHERE checked = '1' 
     AND expires >= '20120207164008'
) GROUP BY JL.j_id ORDER BY total DESC LIMIT 3

The j table is wide (about 100 columns) and contains 248986 rows.

It has the following keys:

  PRIMARY KEY (`j_id`),
  KEY `expires` (`expires`),
  KEY `checked` (`checked`),
  KEY `checked_2` (`checked`,`expires`)

The j_log table has about 63000000 rows and the following structure:

CREATE TABLE `j_log` (
  `j_id` int(11) NOT NULL DEFAULT '0',
  `member_id` int(11) DEFAULT NULL,
  `ip` int(10) unsigned NOT NULL DEFAULT '0',
  `log_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  KEY `j_id` (`j_id`),
  KEY `log_time` (`log_time`),
  KEY `ip` (`ip`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

The query is meant to return the top 3 most visited j_id values.

This is the execution plan:

+----+--------------------+-------+-----------------+-----------------------------------+---------+---------+------+----------+----------+----------------------------------------------+
| id | select_type        | table | type            | possible_keys                     | key     | key_len | ref  | rows     | filtered | Extra                                        |
+----+--------------------+-------+-----------------+-----------------------------------+---------+---------+------+----------+----------+----------------------------------------------+
|  1 | PRIMARY            | JL    | index           | log_time                          | j_id    | 4       | NULL | 63914602 |     0.36 | Using where; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | j     | unique_subquery | PRIMARY,expires,checked,checked_2 | PRIMARY | 4       | func |        1 |   100.00 | Using where                                  |
+----+--------------------+-------+-----------------+-----------------------------------+---------+---------+------+----------+----------+----------------------------------------------+

Sometimes it takes up to 15 minutes.

Is there any way to make it faster?

SELECT JL.j_id, COUNT(*) as total
FROM j_log JL
INNER JOIN j
  ON JL.j_id = j.j_id
  AND j.checked = '1'
  AND j.expires >= '20120207164008'
WHERE JL.log_time > '20120205164008'
GROUP BY JL.j_id
ORDER BY total DESC
LIMIT 3

Will this be faster?
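
One way to find out without waiting for the query to finish is to look at its plan. A minimal sketch, reusing the JOIN query above unchanged and only prefixing it with EXPLAIN. If the j_log row no longer shows type index with roughly 63 million rows, and the DEPENDENT SUBQUERY row is gone, the rewrite is probably an improvement; the actual plan depends on the MySQL version and table statistics.

```sql
-- Sketch: inspect the plan of the JOIN rewrite instead of executing it.
-- EXPLAIN only asks the optimizer for its plan, so it returns quickly.
EXPLAIN
SELECT JL.j_id, COUNT(*) AS total
FROM j_log JL
INNER JOIN j
  ON JL.j_id = j.j_id
  AND j.checked = '1'
  AND j.expires >= '20120207164008'
WHERE JL.log_time > '20120205164008'
GROUP BY JL.j_id
ORDER BY total DESC
LIMIT 3;
```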

  • Why do you use a subquery?
  • Why is checked a string? ('1' instead of just 1)
  • Why do you compare jl.log_time and j.expires differently (> vs >=)?

How about this query

     SELECT j.j_id, COUNT(jl.j_id) as total
       FROM j
  LEFT JOIN j_log jl ON (jl.j_id = j.j_id AND jl.log_time > '20120205164008')
      WHERE j.checked = '1'
        AND j.expires >= '20120207164008'
   GROUP BY j.j_id
   ORDER BY total DESC
      LIMIT 3

Make sure j_id is indexed in both tables (it is the PRIMARY KEY of j; in j_log it can only be a regular KEY, because one j_id has many log rows), and put an index on j.expires, j.checked and jl.log_time. Also make sure the checked field is optimized. I'm not sure what the possible values are, but I assume it's a boolean field, so rather make the field type BIT or use an ENUM.
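
The single-column indexes mentioned here already appear in the schema from the question, so a possible next step is a composite index on j_log. This is only a sketch of an idea that goes beyond the advice above: the index name idx_log_time_j_id is made up, and building an index on about 63 million rows will itself take a while.

```sql
-- Assumption: idx_log_time_j_id is a hypothetical name, not part of the original schema.
-- A composite index lets the log_time range filter and the j_id lookup
-- be answered from the index alone, without reading the table rows.
ALTER TABLE j_log
  ADD INDEX idx_log_time_j_id (log_time, j_id);

-- Afterwards, check whether the optimizer actually picks the new index.
EXPLAIN
SELECT JL.j_id, COUNT(*) AS total
FROM j_log JL
WHERE JL.log_time > '20120205164008'
GROUP BY JL.j_id;
```

Whether the new index is chosen depends on how many rows fall inside the log_time range, so compare the rows estimate before and after.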

Edit

Also check the types of j.expires and jl.log_time. According to the schema in the question, jl.log_time is already a TIMESTAMP, but j.expires may just be a varchar, judging by the value you compare it with: 20120207164008. If so, convert it to a DATETIME field, but don't simply change the column type in place, because you could lose the data.
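
A minimal sketch of a conversion that keeps the data, assuming j.expires really is a varchar holding values such as '20120207164008'. The column name expires_dt is hypothetical, and the steps should be tried on a copy of the table first.

```sql
-- Assumption: j.expires is a VARCHAR in YYYYMMDDHHMMSS form; expires_dt is a made-up name.

-- 1. Add a new DATETIME column next to the old one.
ALTER TABLE j ADD COLUMN expires_dt DATETIME NULL;

-- 2. Fill it by parsing the old string values.
UPDATE j
SET expires_dt = STR_TO_DATE(expires, '%Y%m%d%H%i%s');

-- 3. Count rows that did not parse; this should be 0 before anything is dropped.
SELECT COUNT(*) AS unparsed
FROM j
WHERE expires IS NOT NULL
  AND expires_dt IS NULL;

-- 4. Once verified, drop the old column and give the new one its name.
ALTER TABLE j DROP COLUMN expires;
ALTER TABLE j CHANGE COLUMN expires_dt expires DATETIME NULL;

-- 5. Dropping the column also removed it from the indexes that used it,
--    so re-create them on the new column.
ALTER TABLE j DROP INDEX checked_2;
ALTER TABLE j ADD INDEX expires (expires);
ALTER TABLE j ADD INDEX checked_2 (checked, expires);
```

Keeping the old column until step 3 has been checked is the point of the exercise: nothing is lost if some values turn out not to match the expected format.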
