简体   繁体   中英

MySQL: why is a query using a VIEW less efficient compared to a query directly using the view's underlying JOIN?

I have three tables, bug , bugrule and bugtrace , for which relationships are:

bug     1--------N  bugrule
        id = bugid

bugrule 0---------N bugtrace
        id = ruleid

Because I'm almost always interested in relations between bug <---> bugtrace I have created an appropriate VIEW which is used as part of several queries. Interestingly, queries using this VIEW have significantly worse performance than equivalent queries using the underlying JOIN explicitly.

VIEW definition:

CREATE VIEW bugtracev AS
  SELECT t.*, r.bugid
      FROM bugtrace AS t
      LEFT JOIN bugrule AS r ON t.ruleid=r.id
    WHERE r.version IS NULL

Execution plan for a query using the VIEW (bad performance):

mysql> explain 
      SELECT c.id,state,
             (SELECT COUNT(DISTINCT(t.id)) FROM bugtracev AS t 
               WHERE t.bugid=c.id) 
       FROM bug AS c 
      WHERE c.version IS NULL
        AND c.id<10;
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
| id | select_type        | table | type  | possible_keys | key    | key_len | ref             | rows    | Extra                 |
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
|  1 | PRIMARY            | c     | range | id_2,id       | id_2   | 8       | NULL            |       3 | Using index condition |
|  2 | DEPENDENT SUBQUERY | t     | index | NULL          | ruleid | 9       | NULL            | 1426004 | Using index           |
|  2 | DEPENDENT SUBQUERY | r     | ref   | id_2,id       | id_2   | 8       | bugapp.t.ruleid |       1 | Using where           |
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
3 rows in set (0.00 sec)

Execution plan for a query using the underlying JOIN directly (good performance):

mysql> explain 
       SELECT c.id,state,
              (SELECT COUNT(DISTINCT(t.id)) 
                 FROM bugtrace AS t
                 LEFT JOIN bugrule AS r ON t.ruleid=r.id 
                WHERE r.version IS NULL
                  AND r.bugid=c.id) 
        FROM bug AS c 
       WHERE c.version IS NULL
         AND c.id<10;
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
| id | select_type        | table | type  | possible_keys | key    | key_len | ref         | rows   | Extra                 |
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
|  1 | PRIMARY            | c     | range | id_2,id       | id_2   | 8       | NULL        |      3 | Using index condition |
|  2 | DEPENDENT SUBQUERY | r     | ref   | id_2,id,bugid | bugid  | 8       | bugapp.c.id |      1 | Using where           |
|  2 | DEPENDENT SUBQUERY | t     | ref   | ruleid        | ruleid | 9       | bugapp.r.id | 713002 | Using index           |
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
3 rows in set (0.00 sec)

CREATE TABLE statements (reduced by irrelevant columns) are:

mysql> show create table bug;
CREATE TABLE `bug` (
  `id` bigint(20) NOT NULL,
  `version` int(11) DEFAULT NULL,
  `state` varchar(16) DEFAULT NULL,
  UNIQUE KEY `id_2` (`id`,`version`),
  KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> show create table bugrule;
CREATE TABLE `bugrule` (
  `id` bigint(20) NOT NULL,
  `version` int(11) DEFAULT NULL,
  `bugid` bigint(20) NOT NULL,
  UNIQUE KEY `id_2` (`id`,`version`),
  KEY `id` (`id`),
  KEY `bugid` (`bugid`),
  CONSTRAINT `bugrule_ibfk_1` FOREIGN KEY (`bugid`) REFERENCES `bug` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> show create table bugtrace;
CREATE TABLE `bugtrace` (
  `id` bigint(20) NOT NULL,
  `ruleid` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `ruleid` (`ruleid`),
  CONSTRAINT `bugtrace_ibfk_1` FOREIGN KEY (`ruleid`) REFERENCES `bugrule` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

You ask why about query optimization for a couple of complex queries with COUNT(DISTINCT val) and dependent subqueries. It's hard to know why for sure.

You will probably fix most of your performance problem by getting rid of your dependent subquery, though. Try something like this:

 SELECT c.id,state, cnt.cnt
   FROM bug AS c
   LEFT JOIN (
            SELECT bugid, COUNT(DISTINCT id) cnt
              FROM bugtracev 
             GROUP BY bugid
        ) cnt ON c.id = cnt.bugid
  WHERE c.version IS NULL
    AND c.id<10;

Why does this help? To satisfy the query the optimizer can choose to run the GROUP BY subquery just once, rather than many times. And, you can use EXPLAIN on the GROUP BY subquery to understand its performance.

You may also get a performance boost by creating a compound index on bugrule that matches the query in your view. Try this one.

 CREATE INDEX bugrule_v ON bugrule (version, ruleid, bugid)

and try switching the last two columns like so

 CREATE INDEX bugrule_v ON bugrule (version, ruleid, bugid)

These indexes are called covering indexes because they contain all the columns needed to satisfy your query. version appears first because that helps optimize WHERE version IS NULL in your view definition. That makes it faster.

Pro tip: Avoid using SELECT * in views and queries, especially when you have performance problems. Instead, list the columns you actually need. The * may force the query optimizer to avoid a covering index, even when the index would help.

When using MySQL 5.6 (or older), try with at least MySQL 5.7. According to What's New in MySQL 5.7? :

We have to a large extent unified the handling of derived tables and views. Until now, subqueries in the FROM clause (derived tables) were unconditionally materialized, while views created from the same query expressions were sometimes materialized and sometimes merged into the outer query. This behavior, beside being inconsistent, can lead to a serious performance penalty.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM