MySQL query is very slow

Question

My table contains the following columns:

gamelogs_id (auto_increment primary key)
player_id (int)
player_name (varchar)
game_id (int)
season_id (int)
points (int)

The table has the following indexes:

+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name           | Seq_in_index | Column_name        | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| player_gamelogs |          0 | PRIMARY            |            1 | player_gamelogs_id | A         |      371330 |     NULL | NULL   |      | BTREE      |         |               |
| player_gamelogs |          1 | player_name        |            1 | player_name        | A         |        3375 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | points             |            1 | points             | A         |         506 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | game_id            |            1 | game_id            | A         |       37133 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | season             |            1 | season             | A         |          30 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | team_abbreviation  |            1 | team_abbreviation  | A         |          70 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | player_id          |            1 | game_id            | A         |       41258 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | player_id          |            2 | player_id          | A         |      371330 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | player_id          |            3 | dk_points          | A         |      371330 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | game_player_season |            1 | game_id            | A         |       41258 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | game_player_season |            2 | player_id          | A         |      371330 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | game_player_season |            3 | season_id          | A         |      371330 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

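I'm trying to compute each player's points average for a season as it stood before a given game. So for the 3rd game of the season, avg_points would be the mean of games 1 and 2. Game ids are assigned sequentially, so an earlier game always has a lower game_id than a later one. I also have the option of using a date field, but I figured a numeric comparison would be faster?

My query is as follows: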

SELECT game_id, 
       player_id, 
       player_name, 
       (SELECT avg(points) 
          FROM player_gamelogs t2
         WHERE t2.game_id < t1.game_id 
           AND t1.player_id = t2.player_id 
           AND t1.season_id = t2.season_id) AS avg_points
  FROM player_gamelogs t1
 ORDER BY player_name, game_id;

EXPLAIN produces the following output:

| id | select_type        | table | type | possible_keys                        | key  | key_len | ref  | rows   | Extra                                           |
+----+--------------------+-------+------+--------------------------------------+------+---------+------+--------+-------------------------------------------------+
|  1 | PRIMARY            | t1    | ALL  | NULL                                 | NULL | NULL    | NULL | 371330 | Using filesort                                  |
|  2 | DEPENDENT SUBQUERY | t2    | ALL  | game_id,player_id,game_player_season | NULL | NULL    | NULL | 371330 | Range checked for each record (index map: 0xC8) |

I'm not sure whether this is slow because of the nature of the task or because my query is inefficient. Thanks for any suggestions!

Answer 1

Please consider this query:

SELECT t1.season_id, t1.game_id, t1.player_id, t1.player_name, AVG(COALESCE(t2.points, 0)) AS average_player_points
FROM player_gamelogs t1
        LEFT JOIN player_gamelogs t2 ON 
                t1.game_id > t2.game_id 
            AND t1.player_id = t2.player_id
            AND t1.season_id = t2.season_id 
GROUP BY
    t1.season_id, t1.game_id, t1.player_id, t1.player_name
ORDER BY t1.player_name, t1.game_id;

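Notes:

To perform optimally, you need an additional index on (season_id, game_id, player_id, player_name).
Even better would be to have a player table and retrieve the name from the id. It seems redundant to me that we have to grab the player name from a log table, all the more so if it has to be part of an index.
GROUP BY already sorts by the grouped columns. If you can, avoid ordering afterwards, as it generates useless overhead. As pointed out in the comments, this is not official behavior, and relying on it staying consistent over time should be weighed against the risk of suddenly losing the sort order. MySQL 8.0 no longer sorts GROUP BY results implicitly, so keep the ORDER BY there.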
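
As a rough sketch of the first two notes: the index name `idx_season_game_player_name`, the `players` table, and its column sizes are assumptions made up for illustration and are not part of the schema shown in the question.

```sql
-- Index suggested in the first note (the index name is hypothetical).
CREATE INDEX idx_season_game_player_name
    ON player_gamelogs (season_id, game_id, player_id, player_name);

-- Hypothetical lookup table, so player_name need not be read from the log table.
CREATE TABLE players (
    player_id   INT NOT NULL PRIMARY KEY,
    player_name VARCHAR(100) NOT NULL
);

-- Same aggregation as above, with the name resolved through players.
-- Log rows whose player_id has no match in players are dropped by the inner join.
SELECT t1.season_id, t1.game_id, t1.player_id, p.player_name,
       AVG(COALESCE(t2.points, 0)) AS average_player_points
FROM player_gamelogs t1
        JOIN players p ON p.player_id = t1.player_id
        LEFT JOIN player_gamelogs t2 ON
                t1.game_id > t2.game_id
            AND t1.player_id = t2.player_id
            AND t1.season_id = t2.season_id
GROUP BY
    t1.season_id, t1.game_id, t1.player_id, p.player_name
ORDER BY p.player_name, t1.game_id;
```

With a separate `players` table, the index on the log table could probably drop player_name, but that depends on how the rest of the application reads the table.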

Answer 2

Your query is fine as written:

SELECT game_id, player_id, player_name, 
       (SELECT avg(t2.points) 
        FROM player_gamelogs t2
        WHERE t2.game_id < t1.game_id AND
              t1.player_id = t2.player_id AND
              t1.season_id = t2.season_id
      ) AS avg_points
FROM player_gamelogs t1
ORDER BY player_name, game_id;

However, for best performance you need two composite indexes: (player_id, season_id, game_id, points) and (player_name, game_id, season_id).

The first index should speed up the subquery. The second is for the outer ORDER BY.
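
A minimal sketch of how those two indexes might be created and checked. The index names are invented for illustration, and whether the optimizer actually uses them depends on your data and MySQL version, so re-run EXPLAIN rather than taking it on faith.

```sql
-- For the correlated subquery: the equality columns (player_id, season_id) come
-- first, then the range column (game_id), with points last so that AVG(points)
-- can be read from the index alone. The index name is hypothetical.
CREATE INDEX idx_player_season_game_points
    ON player_gamelogs (player_id, season_id, game_id, points);

-- Intended to help the outer ORDER BY player_name, game_id.
-- The index name is hypothetical.
CREATE INDEX idx_name_game_season
    ON player_gamelogs (player_name, game_id, season_id);

-- Re-check the plan after adding the indexes.
EXPLAIN
SELECT game_id, player_id, player_name,
       (SELECT AVG(t2.points)
        FROM player_gamelogs t2
        WHERE t2.game_id < t1.game_id AND
              t1.player_id = t2.player_id AND
              t1.season_id = t2.season_id
       ) AS avg_points
FROM player_gamelogs t1
ORDER BY player_name, game_id;
```

If the first index is picked up, the t2 row of the plan should no longer show type ALL with "Range checked for each record".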

Answer 3

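As your query stands, for every player you are running through every game and all the games before it. For example, if you have 10 games per person, you get the following results for each season/person: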

Game 10, Game 10 points, avg of games 1-9
Game 9, Game 9 points, avg of games 1-8...
...
...
Game 2, Game 2 points, avg of game 1 only.

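You stated that you want the most recent game with the average of everything before it. That said, I'm assuming you don't care about each of the earlier games for every person.

You are also running the query across ALL seasons. If a season is finished, do you care about old seasons, or just the current one? Otherwise you are going through all seasons and all players...

All that said, I offer the following. First, limit the query to the latest season with a WHERE clause; I intentionally left the season in the select list and GROUP BY in case you DID want other seasons. Then I take the MAXIMUM game for a given person/season as the baseline for the single final row (per person and season), and get the average of everything before it. So, in the 10-game sample scenario, I don't fetch the underlying rows for games 9 through 2; I return only game #10.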

select
      pgMax.player_id,
      pgMax.season_id,
      pgMax.mostRecentGameID,
      pgl3.points as mostRecentGamePoints,
      pgl3.player_name,
      coalesce( avg( pgl2.points ), 0 ) as AvgPointsPriorToCurrentGame
   from
      -- latest game per player within the chosen season
      ( select pgl1.player_id,
               pgl1.season_id,
               max( pgl1.game_id ) as mostRecentGameID
           from
              player_gamelogs pgl1
           where
               pgl1.season_id = JustOneSeason  -- placeholder: substitute the season id you want
           group by
              pgl1.player_id,
              pgl1.season_id ) pgMax

         -- all earlier games for that player/season
         -- (inner join: a player with only one game in the season is omitted)
         JOIN player_gamelogs pgl2
            on pgMax.player_id = pgl2.player_id
           AND pgMax.season_id = pgl2.season_id
           AND pgMax.mostRecentGameID > pgl2.game_id

         -- the most recent game itself, for its points and the player name
         JOIN player_gamelogs pgl3
            on pgMax.player_id = pgl3.player_id
           AND pgMax.season_id = pgl3.season_id
           AND pgMax.mostRecentGameID = pgl3.game_id
   group by
      pgMax.player_id,
      pgMax.season_id,
      pgMax.mostRecentGameID,
      pgl3.points,
      pgl3.player_name
   order by
      pgMax.player_id

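Now, to optimize the query, a composite index on (player_id, season_id, game_id, points) would be best. However, if you are only ever looking at the "current season", make the index (season_id, player_id, game_id, points) instead, putting the SEASON ID in the first position so that it pre-qualifies the WHERE clause.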
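
A sketch of the season-first variant, assuming you only query one season at a time. The index name and the season value `2015` are placeholders made up for illustration.

```sql
-- Season-first composite index (the index name is hypothetical).
CREATE INDEX idx_season_player_game_points
    ON player_gamelogs (season_id, player_id, game_id, points);

-- Check which index the optimizer picks for the "latest game per player" step.
EXPLAIN
SELECT player_id, season_id, MAX(game_id) AS mostRecentGameID
FROM player_gamelogs
WHERE season_id = 2015   -- placeholder season id
GROUP BY player_id, season_id;
```

The same check can be repeated on the full query above to see whether the joins to pgl2 and pgl3 use the new index as well.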

Answer 1: 7 (accepted), 2015-12-31 00:41:23

Answer 2: 2, 2015-12-31 00:54:09

Answer 3: 1, 2015-12-31 01:06:07
