Postgres performs an expensive index scan

Question

I have a table with about 4.5 million rows covering 3 months. I have indexed the "accessedAt" column, which stores time as a Unix epoch and has datatype BIGINT. When I ran the query, Postgres did a bitmap index scan on 700K rows in ~48.2 s. But when I dropped the index, it did a seq scan returning the same 700K rows in ~4 s.

Query (the range is from 17 Sept 2021 to 01 Oct 2021):

explain analyze select id from access_histories where "accessedAt" >= 1631903400 and "accessedAt" <= 1633112999;

Bitmap Heap Scan on access_histories  (cost=14655.35..144416.85 rows=715992 width=8) (actual time=198.176..48191.067 rows=715535 loops=1)
   Recheck Cond: (("accessedAt" >= 1631903400) AND ("accessedAt" <= 1633112999))
   Rows Removed by Index Recheck: 1716759
   Heap Blocks: exact=48015 lossy=33133
   ->  Bitmap Index Scan on "access_histories_accessedAt_idx"  (cost=0.00..14476.35 rows=715992 width=0) (actual time=185.932..185.937 rows=715535 loops=1)
         Index Cond: (("accessedAt" >= 1631903400) AND ("accessedAt" <= 1633112999))
 Planning Time: 0.553 ms
 Execution Time: 48234.459 ms

After dropping the index:

 Seq Scan on access_histories  (cost=0.00..155405.02 rows=715992 width=8) (actual time=2.560..3943.902 rows=715535 loops=1)
   Filter: (("accessedAt" >= 1631903400) AND ("accessedAt" <= 1633112999))
   Rows Removed by Filter: 4234466
 Planning Time: 0.206 ms
 Execution Time: 3982.995 ms
(5 rows)

So my question is: why does Postgres opt for the index scan when the sequential scan is less expensive?

P.S. I know that Postgres will go for a seq scan if it has to query 5-10% of the total data.

Answer 1

There are several things that can work together here:

work_mem is too low for the bitmap heap scan to be as effective as PostgreSQL thinks it should be (Heap Blocks: lossy=33133). A lossy block means the bitmap only remembers the page, not the individual rows, so every row on that page has to be rechecked (Rows Removed by Index Recheck: 1716759).
During the second query, more data are cached (use EXPLAIN (ANALYZE, BUFFERS) to keep track of that).
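
Both points can be checked in one session. The following is only a sketch: it assumes the index from the question has been re-created, and the 64MB value is an illustrative guess, not a tuned recommendation. If work_mem is the limiting factor, the "Heap Blocks" line should no longer report lossy blocks, and the "Buffers" lines should show how many pages came from shared buffers (hit) versus outside them (read).

```sql
-- Assumption: the index from the question exists again, e.g.
-- CREATE INDEX "access_histories_accessedAt_idx" ON access_histories ("accessedAt");

-- Raise work_mem for the current session only (illustrative value).
SET work_mem = '64MB';

-- BUFFERS adds shared hit/read counts to each plan node,
-- which shows how much of the data was already cached.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM access_histories
WHERE "accessedAt" >= 1631903400
  AND "accessedAt" <= 1633112999;

-- Return to the configured default.
RESET work_mem;
```

Running the same statement twice in a row and comparing the "Buffers" output is one way to separate the caching effect from the effect of the plan choice.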
