Slow Postgres 9.3 queries
I'm trying to figure out if I can speed up two queries on a database storing email messages. Here's the table:
\d messages;
                              Table "public.messages"
     Column     |  Type   |                       Modifiers
----------------+---------+-------------------------------------------------------
 id             | bigint  | not null default nextval('messages_id_seq'::regclass)
 created        | bigint  |
 updated        | bigint  |
 version        | bigint  |
 threadid       | bigint  |
 userid         | bigint  |
 groupid        | bigint  |
 messageid      | text    |
 date           | bigint  |
 num            | bigint  |
 hasattachments | boolean |
 placeholder    | boolean |
 compressedmsg  | bytea   |
 revcount       | bigint  |
 subject        | text    |
 isreply        | boolean |
 likes          | bytea   |
 isspecial      | boolean |
 pollid         | bigint  |
 username       | text    |
 fullname       | text    |
Indexes:
    "messages_pkey" PRIMARY KEY, btree (id)
    "idx_unique_message_messageid" UNIQUE, btree (groupid, messageid)
    "idx_unique_message_num" UNIQUE, btree (groupid, num)
    "idx_group_id" btree (groupid)
    "idx_message_id" btree (messageid)
    "idx_thread_id" btree (threadid)
    "idx_user_id" btree (userid)
Output from
SELECT relname, relpages, reltuples::numeric, pg_size_pretty(pg_table_size(oid)) FROM pg_class WHERE oid='messages'::regclass;
is
 relname  | relpages | reltuples | pg_size_pretty
----------+----------+-----------+----------------
 messages |  1584913 |   7337880 | 32 GB
Some possibly relevant postgres config values:
shared_buffers = 1536MB
effective_cache_size = 4608MB
work_mem = 7864kB
maintenance_work_mem = 384MB
Here are the explain analyze outputs:
explain analyze SELECT * FROM messages WHERE groupid=1886 ORDER BY id ASC LIMIT 20 offset 4440;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=479243.63..481402.39 rows=20 width=747) (actual time=14167.374..14167.408 rows=20 loops=1)
   ->  Index Scan using messages_pkey on messages  (cost=0.43..19589605.98 rows=181490 width=747) (actual time=14105.172..14167.188 rows=4460 loops=1)
         Filter: (groupid = 1886)
         Rows Removed by Filter: 2364949
 Total runtime: 14167.455 ms
(5 rows)
The second query:
explain analyze SELECT * FROM messages WHERE groupid=1886 ORDER BY created ASC LIMIT 20 offset 4440;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=538650.72..538650.77 rows=20 width=747) (actual time=671.983..671.992 rows=20 loops=1)
   ->  Sort  (cost=538639.62..539093.34 rows=181490 width=747) (actual time=670.680..671.829 rows=4460 loops=1)
         Sort Key: created
         Sort Method: top-N heapsort  Memory: 7078kB
         ->  Bitmap Heap Scan on messages  (cost=7299.11..526731.31 rows=181490 width=747) (actual time=84.975..512.969 rows=200561 loops=1)
               Recheck Cond: (groupid = 1886)
               ->  Bitmap Index Scan on idx_unique_message_num  (cost=0.00..7253.73 rows=181490 width=0) (actual time=57.239..57.239 rows=203423 loops=1)
                     Index Cond: (groupid = 1886)
 Total runtime: 672.787 ms
(9 rows)
This is on an instance with an SSD and 8 GB of RAM; load average is usually around 0.15.
I'm definitely no expert. Is this a case of the data just being spread throughout the disk? Is my only solution to use CLUSTER?
One thing I don't understand is why it is using idx_unique_message_num as the index for the second query. And why is ordering by ID so much slower?
If there are many records with groupid=1886 (from comment: there are 200,563), getting to the records at an OFFSET within a sorted subset of rows requires sorting (or an equivalent heap algorithm), which is slow.
This could be solved by adding indexes that match each query's filter and sort order. In this case, one on (groupid,id) and another on (groupid,created).
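As a rough sketch of what that might look like (the index names below are made up for illustration; the answer only gives the column lists), the two indexes could be created and the plans re-checked like this. CONCURRENTLY is optional: it avoids blocking writes during the build, but it cannot be run inside a transaction block.

```sql
-- Hypothetical index names; only the column lists come from the answer.
-- Leading column matches the WHERE clause, second column matches ORDER BY.
CREATE INDEX CONCURRENTLY idx_messages_groupid_id
    ON messages (groupid, id);

CREATE INDEX CONCURRENTLY idx_messages_groupid_created
    ON messages (groupid, created);

-- Refresh planner statistics, then re-run the original queries
-- to see whether the planner picks up the new indexes.
ANALYZE messages;

EXPLAIN ANALYZE
SELECT * FROM messages
WHERE groupid = 1886
ORDER BY id ASC
LIMIT 20 OFFSET 4440;

EXPLAIN ANALYZE
SELECT * FROM messages
WHERE groupid = 1886
ORDER BY created ASC
LIMIT 20 OFFSET 4440;
```

With an index whose leading column is groupid and whose second column is the sort key, the planner should be able to read the matching rows already in order and stop after OFFSET + LIMIT entries, rather than filtering a scan of messages_pkey or sorting all rows for the group.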
From comment: This indeed helped, taking the runtime down to 5ms-10ms.