Alternative strategy to CROSS JOIN + LEFT JOIN subquery?

I want to join a table of time units (note: these are not consecutive)

Time 1
Time 2

...with a table of departments...

Department 1
Department 2

...in order to match them against a table of observations, selecting only those of type X...

Time unit     Department id       Observation    Type
Time 1        Department 1        6               X
Time 2        Department 2        5               X
Time 2        Department 2        4               Y

...and end up with a table like this, where missing observations are filled with 0 or NULL:

Time unit     Department id     Observation
Time 1        Department 1        6
Time 2        Department 1        0
Time 1        Department 2        0
Time 2        Department 2        5

The following query does this, but it is slow, so I am sure there must be a better way?

SELECT timeunits.time_unit, departments.department_id, observations.observation 
FROM timeunits
CROSS JOIN departments
LEFT JOIN   (
    SELECT observations.time_unit, observations.department_id, observations.observation 
    FROM observations
    WHERE observations.type='X'
    ) as observations
ON timeunits.time_unit=observations.time_unit 
AND departments.department_id=observations.department_id

EXPLAIN output:

+----+-------------+--------------+-------+---------------+-------------+---------+---------------------------------------------+--------+----------------------------------------------------+
| id | select_type | table        | type  | possible_keys | key         | key_len | ref                                         |  rows  | Extra                                              |
+----+-------------+--------------+-------+---------------+-------------+---------+---------------------------------------------+--------+----------------------------------------------------+
|  1 | PRIMARY     | time_units   | ALL   | NULL          | NULL        | NULL    | NULL                                        |    200 | NULL                                               |
|  1 | PRIMARY     | departments  | index | NULL          | PRIMARY     | 4       | NULL                                        |    500 | Using index; Using join buffer (Block Nested Loop) |
|  1 | PRIMARY     | <derived2>   | ref   | <auto_key0>   | <auto_key0> | 263     | observations.time_units.time_unit,          |        |                                                    |
|    |             |              |       |               |             |         | observations.departments.department_id      |    600 | Using where                                        |
|  2 | DERIVED     | observations | ref   | type          | type        | 258     | const                                       | 100000 | Using index condition                              |
+----+-------------+--------------+-------+---------------+-------------+---------+---------------------------------------------+--------+----------------------------------------------------+

As we have seen, observations with type = 'X' are not very common.

The subquery can be eliminated directly, like this:

SELECT timeunits.time_unit, departments.department_id, observations.observation
FROM timeunits
  JOIN departments
  LEFT JOIN observations ON observations.time_unit = timeunits.time_unit 
    AND observations.department_id = departments.department_id
    AND observations.type = 'X'

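However, this leads to a higher execution time, because MySQL can currently use the index on only one of the columns in observations. Unless that column is type, the complete observations table is joined (we are explicitly querying every department_id/time_unit combination), and rows with a different type are discarded only afterwards, which amounts to a full table scan.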
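One way to check which index MySQL actually picks for the subquery-free statement on your own data is to prefix it with EXPLAIN. This is only a sketch of the check; the plan you get depends on your indexes, data distribution and MySQL version.

```sql
-- Show the execution plan for the subquery-free statement.
EXPLAIN
SELECT timeunits.time_unit, departments.department_id, observations.observation
FROM timeunits
  JOIN departments
  LEFT JOIN observations ON observations.time_unit = timeunits.time_unit
    AND observations.department_id = departments.department_id
    AND observations.type = 'X';
```

In the resulting row for observations, the key column names the index that was chosen and the rows column gives the estimated number of rows examined per lookup.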

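A possible optimization for the subquery-free statement would be a composite index, ideally on department_id, time_unit and type, the three columns that are compared with equals in the join condition. To reduce the storage overhead, we could (and probably should) omit the column that excludes the least data.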
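A composite index of this kind could be created roughly as follows. The index names are hypothetical, and the two-column variant assumes, purely for illustration, that type is the column that filters out the least data; which column to drop depends on your actual data.

```sql
-- Composite index over the three columns used with equals in the join condition.
-- idx_obs_dept_time_type is a made-up name.
CREATE INDEX idx_obs_dept_time_type
  ON observations (department_id, time_unit, type);

-- Smaller alternative (assumption: type is the least selective column,
-- so it is left out to save storage). Use instead of, not in addition to, the above.
-- CREATE INDEX idx_obs_dept_time
--   ON observations (department_id, time_unit);
```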

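If we plan to select, for example, ranges of time_unit later on, we should put this column last in the index so that the index can be used to best effect.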
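As a sketch of that ordering: the columns compared with equals come first and the range column comes last. The index name, the department_id value and the range bounds below are placeholders, not values taken from the real schema.

```sql
-- Equality columns first, the range column (time_unit) last.
-- idx_obs_dept_type_time is a made-up name.
CREATE INDEX idx_obs_dept_type_time
  ON observations (department_id, type, time_unit);

-- Example of a later range query that such an index could serve.
SELECT time_unit, observation
FROM observations
WHERE department_id = 1                          -- placeholder id
  AND type = 'X'
  AND time_unit BETWEEN 'Time 1' AND 'Time 2';   -- placeholder bounds
```

With time_unit in the last position, the equality conditions on department_id and type narrow the index first, and the range condition can then be evaluated on the same index.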

