
Does MySQL optimize selected aggregations on related tables to avoid N+1?

Is this query optimal in MySQL? That is, does it execute a constant number of queries?

Or does it fall into the N+1 problem? I found nothing detailed about this optimization in the official MySQL docs.

SELECT t.*, (SELECT COUNT(1) from related_table rt where rt.t_id = t.id)
FROM table t

At first glance, there is one outer query plus N subqueries, so it would fall into the N+1 problem.

Does MySQL 5.5+ automatically improve this query internally so that it runs a constant number of queries, perhaps by transforming it into something like:

SELECT t.*, COUNT(rt.id)
FROM table t LEFT OUTER JOIN related_table rt ON rt.t_id = t.id
GROUP BY t.id

I know how to improve it by hand, but I'm asking because:

  1. I may contribute, via a library, to a framework whose ORM is (in my opinion) somewhat incomplete.
  2. Curiosity. I found little documentation on this in the official MySQL docs.

No, a correlated subquery in the select-list is not optimized out by MySQL.

You can confirm this by using EXPLAIN to get a report of the optimization plan. Here's a similar query using a test database:

mysql> explain select *, (SELECT COUNT(*) FROM cast_info where cast_info.role_id = role_type.id) AS c 
    from role_type\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: role_type
   partitions: NULL
         type: index
possible_keys: NULL
          key: role
      key_len: 98
          ref: NULL
         rows: 12
     filtered: 100.00
        Extra: Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: cast_info
   partitions: NULL
         type: ref
possible_keys: cr
          key: cr
      key_len: 4
          ref: imdb.role_type.id
         rows: 2534411
     filtered: 100.00
        Extra: Using index

The select type of DEPENDENT SUBQUERY means that the subquery will be executed many times, probably once for each row of the outer query.

Compare with the EXPLAIN for the manually optimized query:

mysql> explain select r.*, COUNT(c.id) AS c from role_type AS r  left outer join cast_info as c on r.id = c.role_id group by r.id\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: r
   partitions: NULL
         type: index
possible_keys: PRIMARY,role
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 12
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
   partitions: NULL
         type: ref
possible_keys: cr
          key: cr
      key_len: 4
          ref: imdb.r.id
         rows: 2534411
     filtered: 100.00
        Extra: Using index

This shows the access to the second table is just a simple join by reference.
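Another manual rewrite worth comparing is to aggregate the related table once in a derived table and then join to it. This is only a sketch against the same test tables; the alias names cnt and c are illustrative, and whether it beats the plain join depends on your data and MySQL version, so check it with EXPLAIN as well:

```sql
-- Count rows per role_id once, then attach the counts to role_type.
-- COALESCE turns the NULL produced by the outer join into 0
-- for roles that have no matching cast_info rows.
SELECT r.*, COALESCE(cnt.c, 0) AS c
FROM role_type AS r
LEFT OUTER JOIN (
    SELECT role_id, COUNT(*) AS c
    FROM cast_info
    GROUP BY role_id
) AS cnt ON cnt.role_id = r.id;
```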

You can also test with the MySQL query profiler. Here's the profile of the second query, which uses the join:

+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000167 |
| checking permissions | 0.000015 |
| checking permissions | 0.000016 |
| Opening tables       | 0.000050 |
| init                 | 0.000059 |
| System lock          | 0.000044 |
| optimizing           | 0.000011 |
| statistics           | 0.000151 |
| preparing            | 0.000099 |
| Sorting result       | 0.000019 |
| executing            | 0.000010 |
| Sending data         | 9.700879 |
| end                  | 0.000024 |
| query end            | 0.000022 |
| closing tables       | 0.000017 |
| freeing items        | 0.000243 |
| cleaning up          | 0.000056 |
+----------------------+----------+

And here's the one with the dependent subquery:

+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000152 |
| checking permissions | 0.000014 |
| checking permissions | 0.000013 |
| Opening tables       | 0.000050 |
| init                 | 0.000067 |
| System lock          | 0.000042 |
| optimizing           | 0.000010 |
| statistics           | 0.000367 |
| preparing            | 0.000033 |
| optimizing           | 0.000015 |
| statistics           | 0.000032 |
| preparing            | 0.000020 |
| executing            | 0.000010 |
| Sending data         | 0.000191 |
| executing            | 0.000010 |
| Sending data         | 4.103899 |
| executing            | 0.000018 |
| Sending data         | 2.413570 |
| executing            | 0.000018 |
| Sending data         | 0.043924 |
| executing            | 0.000022 |
| Sending data         | 0.037834 |
| executing            | 0.000020 |
| Sending data         | 0.014127 |
| executing            | 0.000021 |
| Sending data         | 0.089977 |
| executing            | 0.000023 |
| Sending data         | 0.045968 |
| executing            | 0.000024 |
| Sending data         | 0.000044 |
| executing            | 0.000005 |
| Sending data         | 0.190935 |
| executing            | 0.000034 |
| Sending data         | 1.046394 |
| executing            | 0.000018 |
| Sending data         | 0.017567 |
| executing            | 0.000021 |
| Sending data         | 0.882959 |
| end                  | 0.000046 |
| query end            | 0.000023 |
| closing tables       | 0.000018 |
| freeing items        | 0.000248 |
| cleaning up          | 0.000025 |
+----------------------+----------+

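You can see that the subquery causes multiple executions: the profile contains 13 executing / Sending data pairs, which is consistent with one pass for the outer query plus one subquery execution for each of the 12 rows that EXPLAIN estimates for role_type. In my case, I had just a few rows in the role_type table, but if you have hundreds or thousands, the list of subquery executions can get so long that the profiler truncates the report.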
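For reference, profiles like the two above can be collected with the session-level profiler. This is a minimal sketch using the same test tables. The query IDs 1 and 2 are assumptions, so read the real ones from the SHOW PROFILES output. Note that SHOW PROFILE is deprecated in newer MySQL versions in favor of the Performance Schema.

```sql
-- Enable profiling for the current session only.
SET profiling = 1;

-- Run the two queries being compared.
SELECT r.*, COUNT(c.id) AS c
FROM role_type AS r
LEFT OUTER JOIN cast_info AS c ON r.id = c.role_id
GROUP BY r.id;

SELECT *, (SELECT COUNT(*) FROM cast_info
           WHERE cast_info.role_id = role_type.id) AS c
FROM role_type;

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Show the per-stage breakdown for each statement
-- (replace 1 and 2 with the Query_ID values listed above).
SHOW PROFILE FOR QUERY 1;
SHOW PROFILE FOR QUERY 2;

-- Turn profiling back off.
SET profiling = 0;
```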
