
Optimize NOT IN sub-query in MySQL

I have a table in a MySQL database with more than 20,000,000 rows. The query below runs fine when it touches a small number of rows, but takes 2-3 seconds when there are more. How can I optimize it so that it runs in under 1 second? Note: the problem is in the sub-query SELECT read_state FROM messages... Query:

SELECT sql_no_cache users.id AS uid,
  name,
  avatar,
  avatar_date,
  driver,
  msg,
  DATE,
  messages.removed,
  from_id = 528798 AS outbox ,
  !(0    IN
  (SELECT read_state
  FROM messages AS msgs FORCE KEY(user_id_2)
  WHERE (msgs.from_id = messages.from_id
  OR msgs.from_id = messages.user_id)
  AND msgs.user_id = 528798
  AND removed = 0
  )) AS read_state
FROM dialog,
  messages,
  users
WHERE messages.id = mid
AND ((uid1 = 528798
AND users.id = uid2)
OR (uid2 = 528798
AND users.id = uid1))
ORDER BY DATE DESC;

show index from messages;

+----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name    | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| messages |          0 | PRIMARY     |            1 | id          | A         |    27531939 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | to_number   |            1 | to_number   | A         |          22 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | from_id     |            1 | from_id     | A         |      529460 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | from_id     |            2 | to_number   | A         |      529460 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | user_id_2   |            1 | user_id     | A         |      655522 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | user_id_2   |            2 | read_state  | A         |      917731 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | user_id_2   |            3 | removed     | A         |      949377 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | idx_user_id |            1 | user_id     | A         |      809762 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | idx_from_id |            1 | from_id     | A         |      302548 |     NULL | NULL   |      | BTREE      |         |               |
+----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

desc messages;

+------------+-------------+------+-----+---------+----------------+
| Field      | Type        | Null | Key | Default | Extra          |
+------------+-------------+------+-----+---------+----------------+
| id         | int(11)     | NO   | PRI | NULL    | auto_increment |
| from_id    | int(11)     | NO   | MUL | NULL    |                |
| user_id    | int(11)     | NO   | MUL | NULL    |                |
| group_id   | int(11)     | NO   |     | NULL    |                |
| to_number  | varchar(30) | NO   | MUL | NULL    |                |
| msg        | text        | NO   |     | NULL    |                |
| image      | varchar(20) | NO   |     | NULL    |                |
| date       | bigint(20)  | NO   |     | NULL    |                |
| read_state | tinyint(1)  | NO   |     | 0       |                |
| removed    | tinyint(1)  | NO   |     | NULL    |                |
+------------+-------------+------+-----+---------+----------------+

EXPLAIN EXTENDED:

+----+--------------------+----------+-------------+---------------+-----------+---------+--------------------+--------+----------+---------------------------------------------------------------------------+
| id | select_type        | table    | type        | possible_keys | key       | key_len | ref                | rows   | filtered | Extra                                                                     |
+----+--------------------+----------+-------------+---------------+-----------+---------+--------------------+--------+----------+---------------------------------------------------------------------------+
|  1 | PRIMARY            | dialog   | index_merge | uid1,uid2     | uid1,uid2 | 4,4     | NULL               |   1707 |   100.00 | Using sort_union(uid1,uid2); Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | users    | ALL         | PRIMARY       | NULL      | NULL    | NULL               | 608993 |   100.00 | Range checked for each record (index map: 0x1)                            |
|  1 | PRIMARY            | messages | eq_ref      | PRIMARY       | PRIMARY   | 4       | numbers.dialog.mid |      1 |   100.00 |                                                                           |
|  2 | DEPENDENT SUBQUERY | msgs     | ref         | user_id_2     | user_id_2 | 6       | const,const,const  |   2607 |   100.00 | Using where                                                               |
+----+--------------------+----------+-------------+---------------+-----------+---------+--------------------+--------+----------+---------------------------------------------------------------------------+

Making a few guesses, something like this might be more efficient:

-- Columns that exist in both messages and msgs are qualified to avoid ambiguity errors.
SELECT DISTINCT users.id AS uid,
  name,
  avatar,
  avatar_date,
  driver,
  messages.msg,
  messages.`date`,
  messages.removed,
  messages.from_id = 528798 AS outbox,
  CASE WHEN msgs.read_state IS NULL THEN 1 ELSE 0 END AS read_state
FROM messages
INNER JOIN dialog ON messages.id = dialog.mid
INNER JOIN users ON (dialog.uid1 = 528798 AND users.id = dialog.uid2) OR (dialog.uid2 = 528798 AND users.id = dialog.uid1)
LEFT OUTER JOIN messages msgs ON msgs.read_state = 0 AND msgs.user_id = 528798 AND msgs.removed = 0 AND (msgs.from_id = messages.from_id OR msgs.from_id = messages.user_id)
ORDER BY messages.`date` DESC;

This replaces the correlated sub-query with an extra LEFT JOIN against messages, and then uses CASE to convert the result to 0 or 1: when no unread, non-removed message matches, msgs.read_state is NULL and the row is reported as read.

The DISTINCT copes with the case where the LEFT JOIN brings back multiple matching rows (if that is not possible, you can eliminate the DISTINCT).
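To illustrate why the DISTINCT is there, here is a minimal sketch on a toy table. The table demo_messages, its rows, and the alias unread are hypothetical and only mimic the relevant columns of messages; they are not the real schema or data.

```sql
-- Hypothetical toy table, reduced to the columns the check needs.
CREATE TABLE demo_messages (
  id         INT PRIMARY KEY,
  from_id    INT NOT NULL,
  user_id    INT NOT NULL,
  read_state TINYINT(1) NOT NULL,
  removed    TINYINT(1) NOT NULL
);

INSERT INTO demo_messages VALUES
  (1, 10, 1, 0, 0),  -- unread, from user 10 to user 1
  (2, 10, 1, 0, 0),  -- second unread message from the same sender
  (3, 20, 1, 1, 0);  -- already read, from user 20 to user 1

-- Same pattern as above: LEFT JOIN to the unread rows, then map
-- "no match" (NULL) to 1 and "at least one match" to 0.
SELECT DISTINCT m.from_id,
  CASE WHEN unread.id IS NULL THEN 1 ELSE 0 END AS read_state
FROM demo_messages m
LEFT OUTER JOIN demo_messages unread
  ON unread.user_id = 1
  AND unread.read_state = 0
  AND unread.removed = 0
  AND unread.from_id = m.from_id
ORDER BY m.from_id;
```

With these rows the query returns (10, 0) and (20, 1). Without the DISTINCT it returns five rows, because each of the two messages from user 10 matches both unread rows.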

I suspect the OR clauses in the join onto users will not be that efficient. It may be better to replace the INNER JOIN against users with two LEFT OUTER JOINs. Something like this:

-- Columns that exist in both messages and msgs are qualified to avoid ambiguity errors.
SELECT DISTINCT COALESCE(users1.id, users2.id) AS uid,
  COALESCE(users1.name, users2.name) AS name,
  COALESCE(users1.avatar, users2.avatar) AS avatar,
  COALESCE(users1.avatar_date, users2.avatar_date) AS avatar_date,
  COALESCE(users1.driver, users2.driver) AS driver,
  messages.msg,
  messages.`date`,
  messages.removed,
  messages.from_id = 528798 AS outbox,
  CASE WHEN msgs.read_state IS NULL THEN 1 ELSE 0 END AS read_state
FROM messages
INNER JOIN dialog ON messages.id = dialog.mid
LEFT OUTER JOIN users users1 ON (dialog.uid1 = 528798 AND users1.id = dialog.uid2)
LEFT OUTER JOIN users users2 ON (dialog.uid2 = 528798 AND users2.id = dialog.uid1)
LEFT OUTER JOIN messages msgs ON msgs.read_state = 0 AND msgs.user_id = 528798 AND msgs.removed = 0 AND (msgs.from_id = messages.from_id OR msgs.from_id = messages.user_id)
WHERE users1.id IS NOT NULL
OR users2.id IS NOT NULL
ORDER BY messages.`date` DESC;
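Another guess, separate from the rewrites above: the EXPLAIN output shows the msgs lookup using user_id_2 (user_id, read_state, removed) and then filtering on from_id with "Using where". An index that also contains from_id might let that filter be evaluated from the index alone. The index name idx_unread_lookup below is made up for illustration, and whether the optimizer picks it (and whether it helps) would need to be checked with EXPLAIN on the real data.

```sql
-- Assumption: an extra index is acceptable on this table. Building it on
-- a table of this size can take a while and adds write overhead.
ALTER TABLE messages
  ADD INDEX idx_unread_lookup (user_id, read_state, removed, from_id);

-- Check how a lookup of this shape is resolved afterwards.
-- 123 is a placeholder sender id, not a value from the question.
EXPLAIN
SELECT msgs.read_state
FROM messages AS msgs
WHERE msgs.user_id = 528798
  AND msgs.read_state = 0
  AND msgs.removed = 0
  AND msgs.from_id = 123;
```

If the plan then reports "Using index" for msgs, the lookup is being answered from the index without reading the table rows.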
