
Postgres does an expensive index scan

I have a table with ~4.5 million rows covering 3 months. I have indexed the "accessedAt" column, which stores time as an epoch value and has datatype BIGINT. When I ran the query, Postgres did a bitmap index scan on 700K rows (~48.2s). But when I dropped the index, it did a seq scan that returned the same 700K rows in ~4s.

[QUERY] explain analyze select id from access_histories where "accessedAt" >= 1631903400 and "accessedAt" <= 1633112999;

i.e. from 17 Sept 2021 to 01 Oct 2021.

Bitmap Heap Scan on access_histories  (cost=14655.35..144416.85 rows=715992 width=8) (actual time=198.176..48191.067 rows=715535 loops=1)
   Recheck Cond: (("accessedAt" >= 1631903400) AND ("accessedAt" <= 1633112999))
   Rows Removed by Index Recheck: 1716759
   Heap Blocks: exact=48015 lossy=33133
   ->  Bitmap Index Scan on "access_histories_accessedAt_idx"  (cost=0.00..14476.35 rows=715992 width=0) (actual time=185.932..185.937 rows=715535 loops=1)
         Index Cond: (("accessedAt" >= 1631903400) AND ("accessedAt" <= 1633112999))
 Planning Time: 0.553 ms
 Execution Time: 48234.459 ms

After dropping the index:

 Seq Scan on access_histories  (cost=0.00..155405.02 rows=715992 width=8) (actual time=2.560..3943.902 rows=715535 loops=1)
   Filter: (("accessedAt" >= 1631903400) AND ("accessedAt" <= 1633112999))
   Rows Removed by Filter: 4234466
 Planning Time: 0.206 ms
 Execution Time: 3982.995 ms
(5 rows)

So my question is: why does Postgres opt for an index scan even though the sequential scan is less expensive?

P.S. I know that Postgres will go for a seq scan if it has to query 5-10% of the total data.

There are several things that can work together here:

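  1. work_mem is too low for the bitmap heap scan to be as effective as PostgreSQL thinks it should be ( Heap Blocks: lossy=33133 ). Once the bitmap becomes lossy, it tracks whole pages instead of individual rows, so every row on those pages has to be rechecked ( Rows Removed by Index Recheck: 1716759 ).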

  2. During the second query, more data are cached (use EXPLAIN (ANALYZE, BUFFERS) to keep track of that).
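
Both points can be checked in one session with the index in place. The following is a minimal sketch, not a tuned configuration: the 64MB value is an assumption chosen only to be well above the default, and the right setting depends on the memory available and the number of concurrent sessions.

```sql
-- Raise work_mem for the current session only.
-- '64MB' is an illustrative value, not a recommendation.
SET work_mem = '64MB';

-- BUFFERS adds buffer statistics to each plan node, which shows
-- how many pages came from shared buffers versus being read in.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM access_histories
WHERE "accessedAt" >= 1631903400
  AND "accessedAt" <= 1633112999;

-- Restore the configured default for the session.
RESET work_mem;
```

If the "Heap Blocks" line then reports only exact blocks and "Rows Removed by Index Recheck" disappears, the bitmap fit in memory. In the "Buffers" line, "shared hit" counts pages found in the cache and "read" counts pages that had to be fetched, so running the query twice and comparing the two numbers shows how much of the timing difference comes from caching.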
