Optimize JOIN -> GROUP BY query in PostgreSQL: all indexes are already there
Proper indexes for this join query in Postgresql
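I have two tables:
users: id | name | ...
pull_requests: id | user_id | created_at | ...
I need to get all users joined with the count of their pull requests for a particular year, so I wrote this query: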
SELECT users.*, COUNT(pull_requests.id) as pull_requests_count
FROM "users" INNER JOIN
"pull_requests"
ON "pull_requests"."user_id" = "users"."id"
WHERE (EXTRACT(year FROM pull_requests.created_at) = 2013)
GROUP BY users.id
Initially I only had a btree index on pull_requests.user_id. Running EXPLAIN gave me this:
QUERY PLAN
--------------------------------------------------------------------------------------------
 HashAggregate (cost=63.99..64.02 rows=3 width=2775)
   -> Hash Join (cost=59.19..63.98 rows=3 width=2775)
         Hash Cond: (users.id = pull_requests.user_id)
         -> Seq Scan on users (cost=0.00..4.08 rows=108 width=2771)
         -> Hash (cost=59.16..59.16 rows=3 width=8)
               -> Seq Scan on pull_requests (cost=0.00..59.16 rows=3 width=8)
                     Filter: (date_part('year'::text, created_at) = 2013::double precision)
Then I added an index like this:
CREATE INDEX pull_req_extract_year_created_at_ix ON pull_requests (EXTRACT(year FROM created_at));
Now my EXPLAIN output is:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 HashAggregate (cost=18.93..18.96 rows=3 width=2775)
   -> Hash Join (cost=14.13..18.92 rows=3 width=2775)
         Hash Cond: (users.id = pull_requests.user_id)
         -> Seq Scan on users (cost=0.00..4.08 rows=108 width=2771)
         -> Hash (cost=14.09..14.09 rows=3 width=8)
               -> Bitmap Heap Scan on pull_requests (cost=4.28..14.09 rows=3 width=8)
                     Recheck Cond: (date_part('year'::text, created_at) = 2013::double precision)
                     -> Bitmap Index Scan on pull_req_extract_year_created_at_ix (cost=0.00..4.28 rows=3 width=0)
                           Index Cond: (date_part('year'::text, created_at) = 2013::double precision)
The query still takes about 6.6 ms for roughly 100 rows. How can I optimize it further?
Thanks!
Try combining the two indexes into one:
CREATE INDEX pr_ix ON pull_requests(EXTRACT(year FROM created_at), user_id);
Then phrase the query as:
SELECT users.*, pull_requests_count
FROM "users" INNER JOIN
(select user_id, count(*) as pull_requests_count
from "pull_requests"
WHERE (EXTRACT(year FROM pull_requests.created_at) = 2013)
group by user_id
) pr
ON pr."user_id" = "users"."id";
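The index fully covers the subquery, so the base table does not need to be read; an index scan alone is enough. The aggregated result can then be joined back to users.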
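Whether the planner actually picks an index-only scan for the subquery depends on the PostgreSQL version, the table statistics, and the state of the visibility map, so it is worth checking rather than assuming. A minimal sketch of one way to check, assuming the pr_ix index above has been created and a PostgreSQL version that accepts the parenthesized EXPLAIN options (9.0 or later; index-only scans themselves arrived in 9.2):

```sql
-- Refresh planner statistics and the visibility map for the table.
VACUUM ANALYZE pull_requests;

-- Execute the rewritten query and show the plan that was actually used.
-- Note: EXPLAIN ANALYZE really runs the statement.
EXPLAIN (ANALYZE, BUFFERS)
SELECT users.*, pr.pull_requests_count
FROM users
INNER JOIN (
    SELECT user_id, count(*) AS pull_requests_count
    FROM pull_requests
    WHERE EXTRACT(year FROM created_at) = 2013
    GROUP BY user_id
) pr ON pr.user_id = users.id;
```

If the plan shows an Index Only Scan on pr_ix, the heap was skipped for the subquery. If it shows a Bitmap Heap Scan or Seq Scan on pull_requests instead, the planner judged another path cheaper, which is quite possible on a table this small.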