MySQL: why is a query using a VIEW less efficient compared to a query directly using the view's underlying JOIN?

I have three tables, bug, bugrule and bugtrace, with the following relationships:

bug     1--------N  bugrule
        id = bugid

bugrule 0---------N bugtrace
        id = ruleid

Because I'm almost always interested in the relationship bug <---> bugtrace, I have created a VIEW for it, which is used as part of several queries. Interestingly, queries using this VIEW perform significantly worse than equivalent queries that use the underlying JOIN explicitly.

VIEW definition:

CREATE VIEW bugtracev AS
  SELECT t.*, r.bugid
      FROM bugtrace AS t
      LEFT JOIN bugrule AS r ON t.ruleid=r.id
    WHERE r.version IS NULL

Execution plan for a query using the VIEW (bad performance):

mysql> explain 
      SELECT c.id,state,
             (SELECT COUNT(DISTINCT(t.id)) FROM bugtracev AS t 
               WHERE t.bugid=c.id) 
       FROM bug AS c 
      WHERE c.version IS NULL
        AND c.id<10;
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
| id | select_type        | table | type  | possible_keys | key    | key_len | ref             | rows    | Extra                 |
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
|  1 | PRIMARY            | c     | range | id_2,id       | id_2   | 8       | NULL            |       3 | Using index condition |
|  2 | DEPENDENT SUBQUERY | t     | index | NULL          | ruleid | 9       | NULL            | 1426004 | Using index           |
|  2 | DEPENDENT SUBQUERY | r     | ref   | id_2,id       | id_2   | 8       | bugapp.t.ruleid |       1 | Using where           |
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
3 rows in set (0.00 sec)

Execution plan for a query using the underlying JOIN directly (good performance):

mysql> explain 
       SELECT c.id,state,
              (SELECT COUNT(DISTINCT(t.id)) 
                 FROM bugtrace AS t
                 LEFT JOIN bugrule AS r ON t.ruleid=r.id 
                WHERE r.version IS NULL
                  AND r.bugid=c.id) 
        FROM bug AS c 
       WHERE c.version IS NULL
         AND c.id<10;
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
| id | select_type        | table | type  | possible_keys | key    | key_len | ref         | rows   | Extra                 |
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
|  1 | PRIMARY            | c     | range | id_2,id       | id_2   | 8       | NULL        |      3 | Using index condition |
|  2 | DEPENDENT SUBQUERY | r     | ref   | id_2,id,bugid | bugid  | 8       | bugapp.c.id |      1 | Using where           |
|  2 | DEPENDENT SUBQUERY | t     | ref   | ruleid        | ruleid | 9       | bugapp.r.id | 713002 | Using index           |
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
3 rows in set (0.00 sec)

The CREATE TABLE statements (with irrelevant columns omitted) are:

mysql> show create table bug;
CREATE TABLE `bug` (
  `id` bigint(20) NOT NULL,
  `version` int(11) DEFAULT NULL,
  `state` varchar(16) DEFAULT NULL,
  UNIQUE KEY `id_2` (`id`,`version`),
  KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> show create table bugrule;
CREATE TABLE `bugrule` (
  `id` bigint(20) NOT NULL,
  `version` int(11) DEFAULT NULL,
  `bugid` bigint(20) NOT NULL,
  UNIQUE KEY `id_2` (`id`,`version`),
  KEY `id` (`id`),
  KEY `bugid` (`bugid`),
  CONSTRAINT `bugrule_ibfk_1` FOREIGN KEY (`bugid`) REFERENCES `bug` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> show create table bugtrace;
CREATE TABLE `bugtrace` (
  `id` bigint(20) NOT NULL,
  `ruleid` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `ruleid` (`ruleid`),
  CONSTRAINT `bugtrace_ibfk_1` FOREIGN KEY (`ruleid`) REFERENCES `bugrule` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

You're asking why the optimizer treats two complex queries differently, both involving COUNT(DISTINCT val) and dependent subqueries. It's hard to know the reason for sure.

You will probably fix most of your performance problem by getting rid of your dependent subquery, though. Try something like this:

 SELECT c.id,state, cnt.cnt
   FROM bug AS c
   LEFT JOIN (
            SELECT bugid, COUNT(DISTINCT id) cnt
              FROM bugtracev 
             GROUP BY bugid
        ) cnt ON c.id = cnt.bugid
  WHERE c.version IS NULL
    AND c.id<10;

Why does this help? To satisfy the query, the optimizer can choose to run the GROUP BY subquery just once rather than once per row of bug. You can also use EXPLAIN on the GROUP BY subquery by itself to understand its performance.
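
To look at the aggregate on its own, you could run something like the sketch below. It only uses the bugtracev view and the bug table from the question. The COALESCE variant is an addition of mine, not part of the query above: with a LEFT JOIN, bugs that have no traces come back with NULL instead of the 0 that the original COUNT subquery returns.

```sql
-- Plan for the aggregate alone; compare the type, key and rows columns
-- with the DEPENDENT SUBQUERY rows in the plans from the question.
EXPLAIN
SELECT bugid, COUNT(DISTINCT id) AS cnt
  FROM bugtracev
 GROUP BY bugid;

-- Same join as above, but reporting 0 instead of NULL for bugs without traces.
SELECT c.id, c.state, COALESCE(cnt.cnt, 0) AS cnt
  FROM bug AS c
  LEFT JOIN (
        SELECT bugid, COUNT(DISTINCT id) AS cnt
          FROM bugtracev
         GROUP BY bugid
       ) AS cnt ON c.id = cnt.bugid
 WHERE c.version IS NULL
   AND c.id < 10;
```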

You may also get a performance boost by creating a compound index on bugrule that matches the query in your view. Try this one:

 CREATE INDEX bugrule_v ON bugrule (version, id, bugid)

and try switching the last two columns, like so:

 CREATE INDEX bugrule_v2 ON bugrule (version, bugid, id)

These are called covering indexes because they contain all the columns needed to satisfy your query. version appears first because that lets the index be used for the WHERE version IS NULL condition in your view definition, which makes the lookup faster.
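
To check whether a covering index is actually being picked up, something along these lines may help. It is a sketch that relies only on the bugrule columns shown in the question's CREATE TABLE output (id, version, bugid).

```sql
-- Lists the indexes that currently exist on bugrule, with their column order.
SHOW INDEX FROM bugrule;

-- Reads only the columns the view needs from bugrule.
EXPLAIN
SELECT id, bugid
  FROM bugrule
 WHERE version IS NULL;
```

If the optimizer chooses an index that contains all three columns, the Extra column of the EXPLAIN output should include Using index.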

Pro tip: Avoid using SELECT * in views and queries, especially when you have performance problems. Instead, list the columns you actually need. The * may prevent the query optimizer from using a covering index, even when that index would help.
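
As a rough illustration, the view could list its columns explicitly. The name bugtracev_slim is hypothetical, and the column list assumes only the bugtrace columns shown in the reduced CREATE TABLE output; add any others your queries really need.

```sql
-- Hypothetical variant of the view with an explicit column list
-- instead of t.*; the join and filter are unchanged from bugtracev.
CREATE VIEW bugtracev_slim AS
  SELECT t.id, t.ruleid, r.bugid
    FROM bugtrace AS t
    LEFT JOIN bugrule AS r ON t.ruleid = r.id
   WHERE r.version IS NULL;
```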

If you are using MySQL 5.6 (or older), try at least MySQL 5.7. According to "What's New in MySQL 5.7?":

We have to a large extent unified the handling of derived tables and views. Until now, subqueries in the FROM clause (derived tables) were unconditionally materialized, while views created from the same query expressions were sometimes materialized and sometimes merged into the outer query. This behavior, beside being inconsistent, can lead to a serious performance penalty.
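
To see which case applies to your server, a quick check along these lines may be useful. It is a sketch that assumes the derived_merge flag of optimizer_switch, which was introduced in MySQL 5.7; on older servers the flag is absent and the second statement returns 0.

```sql
-- Server version; the unified handling quoted above applies from 5.7 on.
SELECT VERSION();

-- 1 if the derived_merge optimizer flag is present and enabled, else 0.
SELECT @@optimizer_switch LIKE '%derived_merge=on%' AS derived_merge_enabled;
```

In the EXPLAIN output, a select_type of DERIVED indicates that a derived table or view was materialized rather than merged into the outer query.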
