
Why does PostgreSQL do a Seq Scan instead of an index-only scan when everything is in the index?

Please have a look at a simple example:

=> create table t1 ( a int, b int, c int );
CREATE TABLE

=> insert into t1 select a, a, a from generate_series(1,100) a;
INSERT 0 100

=> create index i1 on t1(b);
CREATE INDEX

=> vacuum t1;
VACUUM

=> explain analyze select b from t1 where b = 10;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Seq Scan on t1  (cost=0.00..2.25 rows=1 width=4) (actual time=0.016..0.035 rows=1 loops=1)
   Filter: (b = 10)
   Rows Removed by Filter: 99
 Planning Time: 0.082 ms
 Execution Time: 0.051 ms
(5 rows)

You can see that I select only b and filter only on b. I also ran vacuum t1; manually so that the visibility information needed for an index-only scan is up to date.

But why does PostgreSQL still do a Seq Scan instead of an index-only scan? Thanks a lot.

Edited:

After adding more rows, it does an index-only scan:

=> insert into t1 select a, a, a from generate_series(1,2000) a;

=> vacuum t1;

=> explain analyze select b from t1 where b = 10;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Index Only Scan using i1 on t1  (cost=0.28..4.45 rows=10 width=4) (actual time=0.038..0.039 rows=1 loops=1)
   Index Cond: (b = 10)
   Heap Fetches: 0
 Planning Time: 0.186 ms
 Execution Time: 0.058 ms
(5 rows)

It seems that PostgreSQL avoids an index-only scan when the number of rows is small.

Since nobody has provided a detailed explanation, I will write a short answer here.

From @a_horse_with_no_name:

100 rows will fit on a single data block, so doing a seq scan will only require a single I/O operation and the index only scan would require the same. Use explain (analyze, buffers) to see more details on the blocks (=buffers) needed by the query.
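A minimal sketch of how to check this on the t1 table from the question. The first statement is the explain (analyze, buffers) call suggested above. The second reads the planner's own page and row counts from pg_class, which VACUUM and ANALYZE refresh. The exact numbers depend on your installation, so no output is shown here.

```sql
-- Show how many blocks (buffers) the chosen plan actually touches.
explain (analyze, buffers) select b from t1 where b = 10;

-- Show how many pages and rows the planner believes the table and the index have.
-- With only 100 narrow rows, t1 is expected to occupy a single page.
select relname, relpages, reltuples
from pg_class
where relname in ('t1', 'i1');
```

If relpages for t1 is 1, the seq scan reads one block, and any index-based plan has to read at least one index block as well, so the index cannot save I/O.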

From https://www.postgresql.org/docs/current/indexes-examine.html:

It is especially fatal to use very small test data sets. While selecting 1000 out of 100000 rows could be a candidate for an index, selecting 1 out of 100 rows will hardly be, because the 100 rows probably fit within a single disk page, and there is no plan that can beat sequentially fetching 1 disk page.
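The seq scan cost of 2.25 in the first plan can be reproduced by hand, which may make the planner's choice easier to follow. The sketch below assumes the default cost settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01, cpu_operator_cost = 0.0025) and uses the seq scan cost formula described in the "Using EXPLAIN" chapter of the PostgreSQL documentation. The column alias is only illustrative.

```sql
-- Estimated seq scan cost = pages read * seq_page_cost
--                         + rows scanned * cpu_tuple_cost
--                         + rows scanned * cpu_operator_cost   (for the WHERE b = 10 filter)
-- With the defaults, relpages = 1 and reltuples = 100:
--   1 * 1.0 + 100 * 0.01 + 100 * 0.0025 = 2.25
select relpages * current_setting('seq_page_cost')::numeric
     + reltuples::numeric * current_setting('cpu_tuple_cost')::numeric
     + reltuples::numeric * current_setting('cpu_operator_cost')::numeric
       as estimated_seq_scan_cost
from pg_class
where relname = 't1';

-- To see what the planner estimates for the index plan on the same small table,
-- discourage seq scans for this session, look at the plan, then restore the setting.
set enable_seqscan = off;
explain select b from t1 where b = 10;
reset enable_seqscan;
```

Comparing the total cost of the two plans shows which one the planner considers cheaper. On a one-page table the seq scan estimate is hard to beat, and the gap closes as the table grows, which is consistent with the index-only scan appearing after more rows were inserted.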
