Why is this query using the index differently when I change the range of search?
I have:
a) an activities table
b) a users table
c) and a worker_to_activity table, which assigns workers to given activities

I have a very simple query that searches for all activities in a given time range to which a specified user is assigned, as follows:

select
a.date,
a.id,
a.address,
a.title,
a.state
from activities a
where a.date >= "start" and a.date <= "end"
and exists ( select worker_id from worker_to_activity where worker_id = "worker_id" and activity_id = a.id )
limit 11;

And, last but not least, I have an index for this purpose:

create index act_fetch_idx on activities (date, id, address, title, state);

For testing purposes (to see whether the index works), I deliberately inserted an activity that takes place before other activities but has a higher ID, simulating someone inserting an activity later than expected (I am talking about activity #50 in the view below):

+----+---------------------+
| id | date                |
+----+---------------------+
|  1 | 2022-09-14 08:00:00 |
|  2 | 2022-09-14 09:00:00 |
|  3 | 2022-09-14 10:00:00 |
| 50 | 2022-09-14 10:55:49 |
|  4 | 2022-09-14 11:00:00 |
|  5 | 2022-09-14 12:00:00 |
|  6 | 2022-09-14 13:00:00 |
|  7 | 2022-09-14 14:00:00 |
|  8 | 2022-09-14 15:00:00 |
|  9 | 2022-09-14 16:00:00 |
| 10 | 2022-09-14 17:00:00 |
| 11 | 2022-09-14 18:00:00 |
| 12 | 2022-09-14 19:00:00 |
| 13 | 2022-09-15 08:00:00 |
| 14 | 2022-09-15 09:00:00 |
| 15 | 2022-09-15 10:00:00 |
| 16 | 2022-09-15 11:00:00 |
| 17 | 2022-09-15 12:00:00 |
| 18 | 2022-09-15 13:00:00 |
| 19 | 2022-09-15 14:00:00 |
| 20 | 2022-09-15 15:00:00 |
+----+---------------------+

In this example, the index works! This is the result of a simple select id, date from activities.

Something strange happens when I use date ranges:

a) In this case (using a "daily" range) the index works:

select
a.date,
a.id,
a.address,
a.title,
a.state
from activities a
where a.date >= "2022-09-14 00:00" and a.date <= "2022-09-14 23:59"
and exists ( select worker_id from worker_to_activity where worker_id = "worker_id" and activity_id = a.id )
limit 11;

Output:

+----+---------------------+
| id | date                |
+----+---------------------+
|  1 | 2022-09-14 08:00:00 |
|  2 | 2022-09-14 09:00:00 |
|  3 | 2022-09-14 10:00:00 |
| 50 | 2022-09-14 10:55:49 |
|  4 | 2022-09-14 11:00:00 |
|  5 | 2022-09-14 12:00:00 |
|  6 | 2022-09-14 13:00:00 |
|  7 | 2022-09-14 14:00:00 |
|  8 | 2022-09-14 15:00:00 |
|  9 | 2022-09-14 16:00:00 |
| 10 | 2022-09-14 17:00:00 |
+----+---------------------+

b) When I use a wider range, i.e. the activities of the next two days, the index "stops working(?)":

select
a.date,
a.id,
a.address,
a.title,
a.state
from activities a
where a.date >= "2022-09-14 00:00" and a.date <= "2022-09-15 23:59"
and exists ( select worker_id from worker_to_activity where worker_id = "worker_id" and activity_id = a.id )
limit 11;

Output:

+----+---------------------+
| id | date                |
+----+---------------------+
|  1 | 2022-09-14 08:00:00 |
|  2 | 2022-09-14 09:00:00 |
|  3 | 2022-09-14 10:00:00 |
|  4 | 2022-09-14 11:00:00 |
|  5 | 2022-09-14 12:00:00 |
|  6 | 2022-09-14 13:00:00 |
|  7 | 2022-09-14 14:00:00 |
|  8 | 2022-09-14 15:00:00 |
|  9 | 2022-09-14 16:00:00 |
| 10 | 2022-09-14 17:00:00 |
| 11 | 2022-09-14 18:00:00 |
+----+---------------------+

The EXPLAIN output of the two queries differs slightly, but I don't understand what causes this behaviour:

a) EXPLAIN a:

+----+-------------+--------------------+------------+--------+----------------------------+---------------+---------+-----------------------+------+----------+--------------------------+
| id | select_type | table              | partitions | type   | possible_keys              | key           | key_len | ref                   | rows | filtered | Extra                    |
+----+-------------+--------------------+------------+--------+----------------------------+---------------+---------+-----------------------+------+----------+--------------------------+
|  1 | SIMPLE      | a                  | NULL       | range  | PRIMARY,date,act_fetch_idx | act_fetch_idx | 4       | NULL                  |   13 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | worker_to_activity | NULL       | eq_ref | PRIMARY,worker_id          | PRIMARY       | 522     | db.a.id,const         |    1 |   100.00 | Using index              |
+----+-------------+--------------------+------------+--------+----------------------------+---------------+---------+-----------------------+------+----------+--------------------------+

b) EXPLAIN b:

+----+-------------+--------------------+------------+--------+---------------------------------------------+-----------+---------+-------------------------------------------+------+----------+-------------+
| id | select_type | table              | partitions | type   | possible_keys                               | key       | key_len | ref                                       | rows | filtered | Extra       |
+----+-------------+--------------------+------------+--------+---------------------------------------------+-----------+---------+-------------------------------------------+------+----------+-------------+
|  1 | SIMPLE      | worker_to_activity | NULL       | ref    | PRIMARY,worker_id                           | worker_id | 514     | const                                     |   23 |   100.00 | Using index |
|  1 | SIMPLE      | a                  | NULL       | eq_ref | PRIMARY,date,act_fetch_idx,count_activities | PRIMARY   | 8       | db.worker_to_activity.activity_id         |    1 |    53.49 | Using where |
+----+-------------+--------------------+------------+--------+---------------------------------------------+-----------+---------+-------------------------------------------+------+----------+-------------+
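The naive solution would be to use ORDER BY (it produces the same execution plan and the same results as the first scenario), but my question is more about why this happens: shouldn't the index order the results in both cases? I searched the web but couldn't find much.
Thanks in advance for your time.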
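Those EXPLAINs are entirely different.
The first query notices that the date range is "small", so it decides to start with activities, fetch a few rows, and then check each row against the other table.
The second query decides that the other table is probably a better place to start.
(Caveat: since the Optimizer has no cross-table statistics, it may well pick the "wrong" order in which to look at the two tables.)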
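One way to see the effect of table order for yourself is to pin it and compare the plans. The sketch below rewrites the EXISTS as a join and uses STRAIGHT_JOIN so that MySQL reads activities first. The alias wa is made up for illustration, and the rewrite assumes that (activity_id, worker_id) is unique in worker_to_activity, which the eq_ref lookup on PRIMARY in the first EXPLAIN suggests but the question does not state. Treat it as an experiment rather than a recommended fix.

```sql
-- Force the join order "activities first" so the plan can be compared
-- with the one the optimizer picks on its own for the two-day range.
EXPLAIN
SELECT a.date, a.id, a.address, a.title, a.state
FROM activities a
STRAIGHT_JOIN worker_to_activity wa      -- wa: hypothetical alias
        ON wa.activity_id = a.id
WHERE a.date >= '2022-09-14 00:00'
  AND a.date <= '2022-09-15 23:59'
  AND wa.worker_id = 'worker_id'         -- placeholder from the question
LIMIT 11;
```

If the forced plan reports a range scan on act_fetch_idx, the rows would come back in index (date) order, but only as a side effect of that plan.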
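Using LIMIT without ORDER BY is usually "wrong". Since the two query plans generate rows in different ways, you should expect the 11 rows to be different and in a different order.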
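As a minimal sketch of what that means for the two-day query from the question: adding an explicit ORDER BY makes the 11 returned rows well defined regardless of which table the optimizer starts with. The a.id tie-breaker is an assumption added here so that rows with identical dates sort deterministically; 'worker_id' is the question's own placeholder.

```sql
-- Same query as in the question, with an explicit ordering.
SELECT a.date, a.id, a.address, a.title, a.state
FROM activities a
WHERE a.date >= '2022-09-14 00:00'
  AND a.date <= '2022-09-15 23:59'
  AND EXISTS (SELECT 1
              FROM worker_to_activity
              WHERE worker_id = 'worker_id'   -- placeholder from the question
                AND activity_id = a.id)
ORDER BY a.date, a.id   -- a.id as tie-breaker is an assumption
LIMIT 11;
```

The question reports that with ORDER BY the plan matches the first scenario; when the plan walks act_fetch_idx (date, id, ...) in order, this sort should not need an extra filesort pass.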
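You can experiment further with EXPLAIN FORMAT=JSON SELECT..., which gives more clues about what the Optimizer is thinking. For still more clues, see "Optimizer trace". I like to check the Handler counts: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#handler_counts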
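The three techniques mentioned above could be tried roughly as follows. This is a sketch against the two-day query from the question, to be run in a single session; 'worker_id' is the question's placeholder, and FLUSH STATUS needs the RELOAD privilege, which is assumed here.

```sql
-- 1. Plan with cost estimates in JSON form.
EXPLAIN FORMAT=JSON
SELECT a.date, a.id, a.address, a.title, a.state
FROM activities a
WHERE a.date >= '2022-09-14 00:00' AND a.date <= '2022-09-15 23:59'
  AND EXISTS (SELECT 1 FROM worker_to_activity
              WHERE worker_id = 'worker_id' AND activity_id = a.id)
LIMIT 11;

-- 2. Optimizer trace: enable, run the query, read the trace, disable.
SET optimizer_trace = 'enabled=on';
SELECT a.date, a.id, a.address, a.title, a.state
FROM activities a
WHERE a.date >= '2022-09-14 00:00' AND a.date <= '2022-09-15 23:59'
  AND EXISTS (SELECT 1 FROM worker_to_activity
              WHERE worker_id = 'worker_id' AND activity_id = a.id)
LIMIT 11;
SELECT trace FROM information_schema.optimizer_trace;
SET optimizer_trace = 'enabled=off';

-- 3. Handler counts: reset the session counters, run the query,
--    then see how many rows each access method actually touched.
FLUSH STATUS;
SELECT a.date, a.id, a.address, a.title, a.state
FROM activities a
WHERE a.date >= '2022-09-14 00:00' AND a.date <= '2022-09-15 23:59'
  AND EXISTS (SELECT 1 FROM worker_to_activity
              WHERE worker_id = 'worker_id' AND activity_id = a.id)
LIMIT 11;
SHOW SESSION STATUS LIKE 'Handler%';
```

Running the same three steps with the one-day range gives a second set of numbers to compare against.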