MySQL very slow query
My table contains the following columns:
gamelogs_id (auto_increment primary key)
player_id (int)
player_name (varchar)
game_id (int)
season_id (int)
points (int)
The table has the following indexes:
+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| player_gamelogs | 0 | PRIMARY | 1 | player_gamelogs_id | A | 371330 | NULL | NULL | | BTREE | | |
| player_gamelogs | 1 | player_name | 1 | player_name | A | 3375 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | points | 1 | points | A | 506 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | game_id | 1 | game_id | A | 37133 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | season | 1 | season | A | 30 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | team_abbreviation | 1 | team_abbreviation | A | 70 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | player_id | 1 | game_id | A | 41258 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | player_id | 2 | player_id | A | 371330 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | player_id | 3 | dk_points | A | 371330 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | game_player_season | 1 | game_id | A | 41258 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | game_player_season | 2 | player_id | A | 371330 | NULL | NULL | YES | BTREE | | |
| player_gamelogs | 1 | game_player_season | 3 | season_id | A | 371330 | NULL | NULL | | BTREE | | |
+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
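I am trying to compute each player's average points for a season prior to each game. So for the 3rd game of the season, avg_points would be the average of games 1 and 2. Game numbers are sequential, so earlier games have lower numbers than later ones. I could also use a date field, but I assumed a numeric comparison would be faster?
My query is as follows: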
SELECT game_id,
player_id,
player_name,
(SELECT avg(points)
FROM player_gamelogs t2
WHERE t2.game_id < t1.game_id
AND t1.player_id = t2.player_id
AND t1.season_id = t2.season_id) AS avg_points
FROM player_gamelogs t1
ORDER BY player_name, game_id;
EXPLAIN produces the following output:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+--------------------------------------+------+---------+------+--------+-------------------------------------------------+
| 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 371330 | Using filesort |
| 2 | DEPENDENT SUBQUERY | t2 | ALL | game_id,player_id,game_player_season | NULL | NULL | NULL | 371330 | Range checked for each record (index map: 0xC8) |
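I'm not sure whether this is due to the nature of the task or because my query is inefficient. Thanks for any suggestions!
Please consider the following query: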
SELECT t1.season_id, t1.game_id, t1.player_id, t1.player_name, AVG(COALESCE(t2.points, 0)) AS average_player_points
FROM player_gamelogs t1
LEFT JOIN player_gamelogs t2 ON
t1.game_id > t2.game_id
AND t1.player_id = t2.player_id
AND t1.season_id = t2.season_id
GROUP BY
t1.season_id, t1.game_id, t1.player_id, t1.player_name
ORDER BY t1.player_name, t1.game_id;
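Notes:
In MySQL versions before 8.0, GROUP BY implicitly sorts on the grouped columns. If you can, avoid ordering afterwards, since it adds unnecessary overhead. As noted in the comments, this is not officially guaranteed behavior, and relying on it staying consistent over time has to be weighed against the risk of suddenly losing the sort order.
Your query is fine as written: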
SELECT game_id, player_id, player_name,
(SELECT avg(t2.points)
FROM player_gamelogs t2
WHERE t2.game_id < t1.game_id AND
t1.player_id = t2.player_id AND
t1.season_id = t2.season_id
) AS avg_points
FROM player_gamelogs t1
ORDER BY player_name, game_id;
However, for best performance you want two composite indexes: (player_id, season_id, game_id, points) and (player_name, game_id, season_id).
The first index should speed up the subquery. The second is for the outer ORDER BY.
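As a rough sketch, the two indexes suggested above could be created as follows. The index names are hypothetical (they do not appear in the question), and whether the optimizer actually uses the second index for the outer sort depends on your MySQL version and data, so it is worth re-running EXPLAIN afterwards.

```sql
-- Index names below are hypothetical; choose names that fit your schema.

-- For the correlated subquery: equality on player_id and season_id,
-- a range on game_id, and points included so AVG(points) can be
-- answered from the index alone.
CREATE INDEX idx_player_season_game_points
    ON player_gamelogs (player_id, season_id, game_id, points);

-- For the outer ORDER BY player_name, game_id.
CREATE INDEX idx_name_game_season
    ON player_gamelogs (player_name, game_id, season_id);

-- Re-check the plan; ideally t2 no longer shows type = ALL.
EXPLAIN
SELECT game_id, player_id, player_name,
       (SELECT AVG(t2.points)
        FROM player_gamelogs t2
        WHERE t2.game_id < t1.game_id
          AND t1.player_id = t2.player_id
          AND t1.season_id = t2.season_id) AS avg_points
FROM player_gamelogs t1
ORDER BY player_name, game_id;
```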
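As your query stands now, for every player you are running through every game and all the games below it. For example, if each person has 10 games, you get the following results per season/person: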
Game 10, Game 10 points, avg of games 1-9
Game 9, Game 9 points, avg of games 1-8...
...
...
Game 2, Game 2 points, avg of game 1 only.
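You stated that you want the most recent game together with the average of everything before it. That is, I assume you don't care about every earlier game row for every person.
You are also running the query across ALL seasons. If a season is over, do you care about the old seasons, or just the current one? Otherwise you are going through all seasons and all players.
All that said, I offer the following. First, a WHERE clause limits the query to just the latest season, but I intentionally left the season in the query and GROUP BY in case you want other seasons. Then, I take the MAXIMUM game for a given person/season as the baseline for the final single row (per person per season), and get the average of everything below it. So, in the 10-game sample scenario, I am not grabbing the underlying rows for games 9 through 2, just returning game #10.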
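select
    pgMax.player_id,
    pgMax.season_id,
    pgMax.mostRecentGameID,
    pgl3.points as mostRecentGamePoints,
    pgl3.player_name,
    -- 0 when the player has no earlier game in the season
    coalesce( avg( pgl2.points ), 0 ) as AvgPointsPriorToCurrentGame
from
    -- most recent game per player for the chosen season
    ( select pgl1.player_id,
             pgl1.season_id,
             max( pgl1.game_id ) as mostRecentGameID
      from
          player_gamelogs pgl1
      where
          pgl1.season_id = JustOneSeason  -- placeholder: substitute the season id you want
      group by
          pgl1.player_id,
          pgl1.season_id ) pgMax
    -- all earlier games for that player/season;
    -- LEFT JOIN keeps players who have played only one game
    LEFT JOIN player_gamelogs pgl2
        on pgMax.player_id = pgl2.player_id
        AND pgMax.season_id = pgl2.season_id
        AND pgMax.mostRecentGameID > pgl2.game_id
    -- the row for the most recent game itself
    JOIN player_gamelogs pgl3
        on pgMax.player_id = pgl3.player_id
        AND pgMax.season_id = pgl3.season_id
        AND pgMax.mostRecentGameID = pgl3.game_id
group by
    pgMax.player_id,
    pgMax.season_id,
    -- listed so the query also runs with ONLY_FULL_GROUP_BY enabled
    pgMax.mostRecentGameID,
    pgl3.points,
    pgl3.player_name
order by
    pgMax.player_id;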
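Now, to optimize the query, a composite index on (player_id, season_id, game_id, points) would be best. However, if you are only ever looking at the "current season", then build your index as (season_id, player_id, game_id, points), putting season_id in the first position so it pre-qualifies the WHERE clause.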
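For illustration, the season-first variant might look like the sketch below. The index name and the season value 2015 are assumptions made up for this example; substitute your own.

```sql
-- Hypothetical index name. season_id comes first so the
-- WHERE season_id = ... filter can narrow the index range
-- before the per-player grouping.
CREATE INDEX idx_season_player_game_points
    ON player_gamelogs (season_id, player_id, game_id, points);

-- Check that the inner "latest game per player" step uses it.
-- 2015 is an assumed season id, used only for illustration.
EXPLAIN
SELECT pgl1.player_id,
       pgl1.season_id,
       MAX(pgl1.game_id) AS mostRecentGameID
FROM player_gamelogs pgl1
WHERE pgl1.season_id = 2015
GROUP BY pgl1.player_id, pgl1.season_id;
```

With season_id leading, the same index can also serve the joins back to player_gamelogs, since they match on season_id and player_id and then compare game_id.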