How to optimize this query in Postgres?
I'm trying to get a count of 'visit' records created by joining values from 3 tables. I have a simple query below, but it takes almost 30 minutes on the database. Is there a way to optimize this query any further?
select a."ClientID" as ClientID, b."ProviderID" as ProviderID, count(1) as VisitCount
from "Log" c
inner join "MessageDetail" b on c."MessageDetailID" = b."MessageDetailID"
inner join "Message" a on a."MessageID" = b."MessageID"
where a."CreatedUTCDate" >= NOW() - INTERVAL '1 HOUR'
group by a."ClientID", b."ProviderID"
Example Result
ClientID ProviderID VisitCount
3245cf64-test-4d05-9d5d-345653566455 677777 1
3245cf64-test-4d05-9d5d-345653566455 677777 1
0284a326-test-4757-b00e-34563465dfgg 9999 5
Explain plan
GroupAggregate (cost=6529150.62..6529160.28 rows=483 width=48)
Group Key: a."ClientID", b."ProviderID"
-> Sort (cost=6529150.62..6529151.83 rows=483 width=40)
Sort Key: a."ClientID", b."ProviderID"
-> Nested Loop (cost=1.00..6529129.09 rows=483 width=40)
-> Nested Loop (cost=0.56..6509867.54 rows=3924 width=48)
-> Seq Scan on "Message" a (cost=0.00..6274917.96 rows=3089 width=44)
Filter: ("CreatedUTCDate" >= (now() - '01:00:00'::interval))
-> Index Scan using "ix_MessageDetail_MessageId" on "MessageDetail" b (cost=0.56..75.40 rows=66 width=20)
Index Cond: ("MessageID" = a."MessageID")
-> Index Only Scan using "ix_Log_MessageDetailId" on "Log" c (cost=0.43..4.90 rows=1 width=8)
Index Cond: ("MessageDetailID" = b."MessageDetailID")
Explain Analyze Plan
GroupAggregate (cost=6529127.35..6529137.01 rows=483 width=48) (actual time=791639.382..791661.555 rows=118 loops=1)
Group Key: a."ClientID", b."ProviderID"
-> Sort (cost=6529127.35..6529128.56 rows=483 width=40) (actual time=791639.373..791649.235 rows=64412 loops=1)
Sort Key: a."ClientID", b."ProviderID"
Sort Method: external merge Disk: 3400kB
-> Nested Loop (cost=1.00..6529105.82 rows=483 width=40) (actual time=25178.920..791410.769 rows=64412 loops=1)
-> Nested Loop (cost=0.56..6509844.55 rows=3924 width=48) (actual time=25178.874..790954.577 rows=65760 loops=1)
-> Seq Scan on "Message" a (cost=0.00..6274894.96 rows=3089 width=44) (actual time=25178.799..790477.178 rows=25121 loops=1)
Filter: ("CreatedUTCDate" >= (now() - '01:00:00'::interval))
Rows Removed by Filter: 30839080
-> Index Scan using "ix_MessageDetail_MessageId" on "MessageDetail" b (cost=0.56..75.40 rows=66 width=20) (actual time=0.009..0.016 rows=3 loops=25121)
Index Cond: ("MessageID" = a."MessageID")
-> Index Only Scan using "ix_Log_MessageDetailId" on "Log" c (cost=0.43..4.90 rows=1 width=8) (actual time=0.005..0.006 rows=1 loops=65760)
Index Cond: ("MessageDetailID" = b."MessageDetailID")
Heap Fetches: 65590
Planning time: 38.501 ms
Execution time: 791662.728 ms
This part of the execution plan
-> Seq Scan on "Message" a (...) (actual time=25178.799..790477.178 rows=25121 loops=1)
Filter: ("CreatedUTCDate" >= (now() - '01:00:00'::interval))
Rows Removed by Filter: 30839080
proves that an index on "CreatedUTCDate" would speed up this query quite a lot:

- almost the complete execution time is spent in this sequential scan
- you scan over 30 million rows to find 25000, so the filter condition is highly selective
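A sketch of the suggested index (the index name is illustrative; the INCLUDE variant assumes PostgreSQL 11 or later):

```sql
-- Plain B-tree index on the filter column, so the planner can replace
-- the sequential scan over ~30 million rows with an index scan:
CREATE INDEX "ix_Message_CreatedUTCDate" ON "Message" ("CreatedUTCDate");

-- Optionally, carrying the join and group columns in the index can
-- enable an index-only scan on "Message" (assumption: the visibility
-- map is reasonably up to date, e.g. after VACUUM):
-- CREATE INDEX "ix_Message_CreatedUTCDate_covering"
--     ON "Message" ("CreatedUTCDate")
--     INCLUDE ("MessageID", "ClientID");
```

After creating the index, re-running EXPLAIN ANALYZE should show an index scan (or bitmap heap scan) on "Message" in place of the sequential scan.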