Optimizing slow MySQL select query
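Edit: After looking at some of the answers here and hours of research, my team concluded that there is most likely no way to optimize this beyond the 4.5 seconds we were able to achieve (unless perhaps by partitioning offers_clicks, but that would have some ugly side effects). In the end, after a lot of brainstorming, we decided to split the query in two, build two sets of user IDs (one from the users table and one from offers_clicks), and compare them with set in Python. The set of IDs from the users table is still pulled from SQL, but we moved offers_clicks to Lucene and added some caching on top of it, so that is where the other set of IDs now comes from. The end result is down to about half a second with the cache and 0.9 seconds without it.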
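The set comparison described in the edit can be sketched roughly as follows. This is only an illustration of the idea: the function name and the sample ID lists are hypothetical, and the real IDs would come from the SQL query on users and from the Lucene-backed click store.

```python
def count_matching_users(user_ids_from_sql, user_ids_from_clicks):
    """Count the user IDs that appear in both collections."""
    # Converting to sets removes duplicates; & is set intersection.
    return len(set(user_ids_from_sql) & set(user_ids_from_clicks))


# Hypothetical sample data standing in for the two sources.
active_us_users = [1, 2, 3, 5, 8]        # e.g. IDs selected from the users table
clicking_users = [2, 3, 3, 8, 13, 21]    # e.g. IDs from the click store (may repeat)

count = count_matching_users(active_us_users, clicking_users)
print(count)
```

Because the intersection works on sets, repeated click rows for the same user do not inflate the count, which is the same effect count(distinct ...) has in the SQL version.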
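Start of the original post: I am having trouble optimizing a query. The first version of the query is fine, but as soon as offers_clicks is joined in the second version, the query becomes rather slow. The users table contains 10 million rows and offers_clicks contains 53 million rows.
Acceptable performance: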
SELECT count(distinct(users.id)) AS count_1
FROM users USE index (country_2)
WHERE users.country = 'US'
AND users.last_active > '2015-02-26';
1 row in set (0.35 sec)
Bad:
SELECT count(distinct(users.id)) AS count_1
FROM offers_clicks USE index (user_id_3), users USE index (country_2)
WHERE users.country = 'US'
AND users.last_active > '2015-02-26'
AND offers_clicks.user_id = users.id
AND offers_clicks.date > '2015-02-14'
AND offers_clicks.ranking_score < 3.49
AND offers_clicks.ranking_score > 0.24;
1 row in set (7.39 sec)
Here is what it looks like without specifying any indexes (even worse):
SELECT count(distinct(users.id)) AS count_1
FROM offers_clicks, users
WHERE users.country IN ('US')
AND users.last_active > '2015-02-26'
AND offers_clicks.user_id = users.id
AND offers_clicks.date > '2015-02-14'
AND offers_clicks.ranking_score < 3.49
AND offers_clicks.ranking_score > 0.24;
1 row in set (17.72 sec)
Explain:
explain SELECT count(distinct(users.id)) AS count_1 FROM offers_clicks USE index (user_id_3), users USE index (country_2) WHERE users.country IN ('US') AND users.last_active > '2015-02-26' AND offers_clicks.user_id = users.id AND offers_clicks.date > '2015-02-14' AND offers_clicks.ranking_score < 3.49 AND offers_clicks.ranking_score > 0.24;
+----+-------------+---------------+-------+---------------+-----------+---------+------------------------------+--------+--------------------------+
| id | select_type | table         | type  | possible_keys | key       | key_len | ref                          | rows   | Extra                    |
+----+-------------+---------------+-------+---------------+-----------+---------+------------------------------+--------+--------------------------+
|  1 | SIMPLE      | users         | range | country_2     | country_2 | 14      | NULL                         | 245014 | Using where; Using index |
|  1 | SIMPLE      | offers_clicks | ref   | user_id_3     | user_id_3 | 4       | dejong_pointstoshop.users.id | 270153 | Using where; Using index |
+----+-------------+---------------+-------+---------------+-----------+---------+------------------------------+--------+--------------------------+
Explain without specifying any indexes:
mysql> explain SELECT count(distinct(users.id)) AS count_1 FROM offers_clicks, users WHERE users.country IN ('US') AND users.last_active > '2015-02-26' AND offers_clicks.user_id = users.id AND offers_clicks.date > '2015-02-14' AND offers_clicks.ranking_score < 3.49 AND offers_clicks.ranking_score > 0.24;
+----+-------------+---------------+-------+------------------------------------------------------------------------+-----------+---------+------------------------------+--------+--------------------------+
| id | select_type | table         | type  | possible_keys                                                          | key       | key_len | ref                          | rows   | Extra                    |
+----+-------------+---------------+-------+------------------------------------------------------------------------+-----------+---------+------------------------------+--------+--------------------------+
|  1 | SIMPLE      | users         | range | PRIMARY,last_active,country,last_active_2,country_2                    | country_2 | 14      | NULL                         | 221606 | Using where; Using index |
|  1 | SIMPLE      | offers_clicks | ref   | user_id,user_id_2,date,date_2,date_3,ranking_score,user_id_3,user_id_4 | user_id_2 | 4       | dejong_pointstoshop.users.id |      3 | Using where              |
+----+-------------+---------------+-------+------------------------------------------------------------------------+-----------+---------+------------------------------+--------+--------------------------+
Here is a bunch of indexes I have tried without much success:
+---------------+------------+-----------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table         | Non_unique | Key_name                    | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+-----------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| offers_clicks |          1 | user_id_3                   |            1 | user_id         | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_3                   |            2 | ranking_score   | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_3                   |            3 | date            | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_2                   |            1 | user_id         | A         |    17838712 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_2                   |            2 | date            | A         |    53516137 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_4                   |            1 | user_id         | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_4                   |            2 | date            | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| offers_clicks |          1 | user_id_4                   |            3 | ranking_score   | A         |         198 |     NULL | NULL   |      | BTREE      |         |               |
| users         |          1 | country_2                   |            1 | country         | A         |          14 |     NULL | NULL   |      | BTREE      |         |               |
| users         |          1 | country_2                   |            2 | last_active     | A         |     8048529 |     NULL | NULL   |      | BTREE      |         |               |
Simplified users schema:
+---------------------------------+---------------+------+-----+---------------------+----------------+
| Field                           | Type          | Null | Key | Default             | Extra          |
+---------------------------------+---------------+------+-----+---------------------+----------------+
| id                              | int(11)       | NO   | PRI | NULL                | auto_increment |
| country                         | char(2)       | NO   | MUL |                     |                |
| last_active                     | datetime      | NO   | MUL | 2000-01-01 00:00:00 |                |
Simplified offers_clicks schema:
+-----------------+------------------+------+-----+---------------------+----------------+
| Field           | Type             | Null | Key | Default             | Extra          |
+-----------------+------------------+------+-----+---------------------+----------------+
| id              | int(11)          | NO   | PRI | NULL                | auto_increment |
| user_id         | int(11)          | NO   | MUL | 0                   |                |
| offer_id        | int(11) unsigned | NO   | MUL | NULL                |                |
| date            | datetime         | NO   | MUL | 0000-00-00 00:00:00 |                |
| ranking_score   | decimal(5,2)     | NO   | MUL | 0.00                |                |
This is your query:
SELECT count(distinct u.id) AS count_1
FROM offers_clicks oc JOIN
users u
ON oc.user_id = u.id
WHERE u.country IN ('US') AND u.last_active > '2015-02-26' AND
oc.date > '2015-02-14' AND
oc.ranking_score > 0.24 AND oc.ranking_score < 3.49;
First, instead of count(distinct), you could consider writing the query as:
SELECT count(*) AS count_1
FROM users u
WHERE u.country IN ('US') AND u.last_active > '2015-02-26' AND
EXISTS (SELECT 1
FROM offers_clicks oc
WHERE oc.user_id = u.id AND
oc.date > '2015-02-14' AND
oc.ranking_score > 0.24 AND oc.ranking_score < 3.49
)
Then, the best indexes for this query are users(country, last_active, id) and either offers_clicks(user_id, date, ranking_score) or offers_clicks(user_id, ranking_score, date).
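To check that the EXISTS form returns the same count as the count(distinct) join, a small experiment along these lines may help. It is a sketch under loud assumptions: it uses an in-memory SQLite database rather than MySQL, the rows are made-up sample data, and the index names are hypothetical. It says nothing about timing on the real 53-million-row table.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; all rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT, last_active TEXT);
CREATE TABLE offers_clicks (
    id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT, ranking_score REAL);
-- Composite indexes mirroring the suggestion (index names are hypothetical).
CREATE INDEX idx_users_country_active ON users (country, last_active, id);
CREATE INDEX idx_clicks_user_date_score
    ON offers_clicks (user_id, date, ranking_score);
INSERT INTO users VALUES
    (1, 'US', '2015-03-01 00:00:00'),
    (2, 'US', '2015-03-02 00:00:00'),
    (3, 'US', '2015-01-01 00:00:00'),
    (4, 'DE', '2015-03-01 00:00:00');
INSERT INTO offers_clicks (user_id, date, ranking_score) VALUES
    (1, '2015-02-20 00:00:00', 1.00),
    (1, '2015-02-21 00:00:00', 2.00),
    (2, '2015-02-20 00:00:00', 5.00),
    (3, '2015-02-20 00:00:00', 1.00),
    (4, '2015-02-20 00:00:00', 1.00);
""")

# Original formulation: join, then count distinct users.
join_count = conn.execute("""
    SELECT count(DISTINCT u.id)
    FROM offers_clicks oc JOIN users u ON oc.user_id = u.id
    WHERE u.country IN ('US') AND u.last_active > '2015-02-26'
      AND oc.date > '2015-02-14'
      AND oc.ranking_score > 0.24 AND oc.ranking_score < 3.49
""").fetchone()[0]

# Rewritten formulation: count users for which a matching click exists.
exists_count = conn.execute("""
    SELECT count(*)
    FROM users u
    WHERE u.country IN ('US') AND u.last_active > '2015-02-26'
      AND EXISTS (SELECT 1 FROM offers_clicks oc
                  WHERE oc.user_id = u.id
                    AND oc.date > '2015-02-14'
                    AND oc.ranking_score > 0.24 AND oc.ranking_score < 3.49)
""").fetchone()[0]

print(join_count, exists_count)
```

In this sample only user 1 qualifies, and both formulations agree even though that user has two matching clicks, because EXISTS stops at the first match instead of producing one row per click.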
SELECT count(distinct u.id) AS count_1
FROM users u
STRAIGHT_JOIN offers_clicks oc
ON oc.user_id = u.id
WHERE
u.country IN ('US')
AND u.last_active > '2015-02-26'
AND oc.date > '2015-02-14'
AND oc.ranking_score > 0.24
AND oc.ranking_score < 3.49;
Make sure you have an index on users covering the columns (id, last_active, country) and one on offers_clicks covering (user_id, date, ranking_score).
Or you can reverse the order:
SELECT count(distinct u.id) AS count_1
FROM offers_clicks oc
STRAIGHT_JOIN users u
ON oc.user_id = u.id
WHERE
u.country IN ('US')
AND u.last_active > '2015-02-26'
AND oc.date > '2015-02-14'
AND oc.ranking_score > 0.24
AND oc.ranking_score < 3.49;
Make sure you have indexes on offers_clicks (user_id) and on users (id, last_active, country).
SELECT count(users.id) AS count_1
FROM users
INNER JOIN
(SELECT
DISTINCT user_id
FROM
offers_clicks
WHERE offers_clicks.date > '2015-02-14'
AND offers_clicks.ranking_score < 3.49
AND offers_clicks.ranking_score > 0.24
) as clicks
ON clicks.user_id = users.id
WHERE users.country IN ('US')
AND users.last_active > '2015-02-26'
Can you provide some data on sqlfiddle?
Can you tell me the execution time of this query:
SELECT
DISTINCT user_id
FROM
offers_clicks
WHERE offers_clicks.date > '2015-02-14'
AND offers_clicks.ranking_score < 3.49
AND offers_clicks.ranking_score > 0.24
Edit: how long does this query take?
SELECT
DISTINCT user_id
FROM
offers_clicks USE INDEX (user_id_4)
WHERE offers_clicks.date > '2015-02-14'
AND offers_clicks.ranking_score < 3.49
AND offers_clicks.ranking_score > 0.24
Try doing it like this:
SELECT COUNT(users.id)
FROM users, offers_clicks
WHERE users.country = 'US'
AND users.last_active > '2015-02-26'
AND offers_clicks.user_id = users.id
AND offers_clicks.date > '2015-02-14'
AND offers_clicks.ranking_score < 3.49
AND offers_clicks.ranking_score > 0.24;
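One caveat about dropping DISTINCT, as the query above does: a join yields one row per matching click, so COUNT(users.id) counts clicks rather than users whenever a user has several qualifying clicks. The sketch below illustrates this with an in-memory SQLite database and invented rows (both are assumptions for illustration, not the poster's MySQL data).

```python
import sqlite3

# In-memory SQLite stand-in; one user with three qualifying clicks (made-up data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT, last_active TEXT);
CREATE TABLE offers_clicks (
    id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT, ranking_score REAL);
INSERT INTO users VALUES (1, 'US', '2015-03-01 00:00:00');
INSERT INTO offers_clicks (user_id, date, ranking_score) VALUES
    (1, '2015-02-20 00:00:00', 1.00),
    (1, '2015-02-21 00:00:00', 2.00),
    (1, '2015-02-22 00:00:00', 3.00);
""")

# Same join and filters, with only the aggregate swapped in.
query = """
    SELECT {aggregate}
    FROM users, offers_clicks
    WHERE users.country = 'US'
      AND users.last_active > '2015-02-26'
      AND offers_clicks.user_id = users.id
      AND offers_clicks.date > '2015-02-14'
      AND offers_clicks.ranking_score < 3.49
      AND offers_clicks.ranking_score > 0.24
"""

plain_count = conn.execute(
    query.format(aggregate="COUNT(users.id)")).fetchone()[0]
distinct_count = conn.execute(
    query.format(aggregate="COUNT(DISTINCT users.id)")).fetchone()[0]

print(plain_count, distinct_count)
```

Here the plain count reports three rows for a single user, so the two forms are interchangeable only if each user can match at most one click.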
Try this:
SELECT count(distinct users.id) AS count_1
FROM users USE index (<see below>)
JOIN offers_clicks USE index (<see below>)
ON offers_clicks.user_id = users.id
AND offers_clicks.date BETWEEN '2015-02-14' AND CURRENT_DATE
AND offers_clicks.ranking_score BETWEEN 0.24 AND 3.49
WHERE users.country = 'US'
AND users.last_active BETWEEN '2015-02-26' AND CURRENT_DATE
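Make sure there are indexes on users(country, last_active, id) and offers_clicks(user_id, ranking_score, date), and USE them.
Let me know how it performs, and if it works I will explain why.
First, I also think you should use a join, and try to join only the rows you really need in the result.
For the table offers_clicks, I think you should not use the index user_id_3 but user_id_2 instead, because the cardinality of user_id_2 is higher than that of user_id_3 (according to your index listing), so it should be faster.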
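A note on the BETWEEN rewrite: BETWEEN is inclusive at both ends, whereas the original query uses strict > and < comparisons, so rows sitting exactly on a boundary are treated differently. The sketch below shows the difference using an in-memory SQLite database and made-up rows (assumptions for illustration only; the real column is decimal(5,2) in MySQL).

```python
import sqlite3

# In-memory SQLite stand-in with three invented clicks, two of them on the boundaries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE offers_clicks (
    id INTEGER PRIMARY KEY, user_id INTEGER, ranking_score REAL);
INSERT INTO offers_clicks (user_id, ranking_score) VALUES
    (1, 0.24), (2, 1.00), (3, 3.49);
""")

# Strict comparisons, as in the original question.
strict_count = conn.execute("""
    SELECT count(*) FROM offers_clicks
    WHERE ranking_score > 0.24 AND ranking_score < 3.49
""").fetchone()[0]

# BETWEEN, as in the suggested rewrite; both endpoints are included.
between_count = conn.execute("""
    SELECT count(*) FROM offers_clicks
    WHERE ranking_score BETWEEN 0.24 AND 3.49
""").fetchone()[0]

print(strict_count, between_count)
```

If the boundary values must stay excluded, the strict comparisons from the question should be kept.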
SELECT
count(distinct(users.id)) AS count_1
FROM users USE INDEX (country_2)
JOIN offers_clicks USE INDEX (user_id_2)
ON offers_clicks.user_id = users.id
AND offers_clicks.date > '2015-02-14'
AND offers_clicks.ranking_score < 3.49
AND offers_clicks.ranking_score > 0.24
WHERE users.country = 'US' AND users.last_active > '2015-02-26'
;
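For this query you do not need to alter the tables, which is why I think you could try it.
Reducing the date range might also help: it reduces the number of rows in the result, so it should be faster.
Not sure this will be of any help...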