
PostgreSQL Bitmap Heap Scan on index is very slow but Index Only Scan is fast

I created a table with 43 million rows and populated it with values from 1 to 200, so each value occurs about 220k times, spread throughout the table.

create table foo (id integer primary key, val bigint);
insert into foo
  select i, random() * 200 from generate_series(1, 43000000) as i;
create index val_index on foo(val);
vacuum analyze foo;
explain analyze select id from foo where val = 55;

Result: http://explain.depesz.com/s/fdsm

I expect a total runtime under 1 s. Is that possible? I have an SSD, a Core i5 (1.8), 4 GB of RAM, and PostgreSQL 9.3.

If I use an Index Only Scan, it is very fast:

explain analyze select val from foo where val = 55;

http://explain.depesz.com/s/7hm

But I need to select id, not val, so an Index Only Scan on val_index is not suitable in my case.

Thanks in advance!

Additional info:

SELECT relname, relpages, reltuples::numeric, pg_size_pretty(pg_table_size(oid)) 
FROM pg_class WHERE oid='foo'::regclass;

Result:

"foo";236758;43800000;"1850 MB"

Config:

"cpu_index_tuple_cost";"0.005";""
"cpu_operator_cost";"0.0025";""
"cpu_tuple_cost";"0.01";""
"effective_cache_size";"16384";"8kB"
"max_connections";"100";""
"max_stack_depth";"2048";"kB"
"random_page_cost";"4";""
"seq_page_cost";"1";""
"shared_buffers";"16384";"8kB"
"temp_buffers";"1024";"8kB"
"work_mem";"204800";"kB"

I got an answer here: http://ask.use-the-index-luke.com/questions/235/postgresql-bitmap-heap-scan-on-index-is-very-slow-but-index-only-scan-is-fast

The trick is to use a composite index on val and id:

create index val_id_index on foo(val, id);

Now an Index Only Scan is used, and I can still select id:

select id from foo where val = 55;

Result:

http://explain.depesz.com/s/nDt3

But this works only in PostgreSQL 9.2 and later. If you are forced to use an older version, try other options.
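An index-only scan can skip the heap only for pages that the visibility map marks as all-visible, so it helps to vacuum the table after building the index and then confirm the plan. A minimal sketch, assuming the foo table from the question. The INCLUDE variant in the trailing comment is an alternative not mentioned in the original answer; it needs PostgreSQL 11 or later, and the index name val_incl_id_index is hypothetical:

```sql
-- Composite index from the answer: val is the search key, id rides along.
-- (Skip this statement if the index was already created above.)
CREATE INDEX val_id_index ON foo (val, id);

-- Refresh the visibility map and the planner statistics.
VACUUM ANALYZE foo;

-- The plan should show "Index Only Scan using val_id_index".
-- A low "Heap Fetches" count means few heap pages had to be visited.
EXPLAIN (ANALYZE, BUFFERS) SELECT id FROM foo WHERE val = 55;

-- Alternative (assumption: PostgreSQL 11+), id stored as a non-key column:
-- CREATE INDEX val_incl_id_index ON foo (val) INCLUDE (id);
```

If the table is updated heavily, "Heap Fetches" tends to grow again until the next vacuum marks the changed pages as all-visible.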

Although you're querying only 0.5% of the table, or ~10 MB worth of data (out of a nearly 2 GB table), the values of interest are spread evenly across the whole table, so the matching rows are scattered over a large share of the table's pages.

You can see this in the first plan you provided:

  • BitmapIndexScan completes in 123.172 ms.
  • BitmapHeapScan takes 17055.046 ms.
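The scattering can be checked directly. A rough sketch, assuming the foo table from the question: the first query reads the planner's physical-order correlation for val (a value near 0 means the rows are in no particular order on disk), and the second counts how many distinct heap pages hold matching rows by extracting the block number from the ctid system column.

```sql
-- Planner statistic: how closely the on-disk row order follows val.
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'foo' AND attname = 'val';

-- ctid is (block number, offset); the point cast exposes the block number.
SELECT count(*) AS matching_rows,
       count(DISTINCT (ctid::text::point)[0]) AS heap_pages_touched
FROM foo
WHERE val = 55;
```

The second query has to visit the same pages as the original one, so on an unclustered table it will be similarly slow. It is meant as a one-off diagnostic.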

You can try clustering your table in index order, which puts rows with the same value together on the same pages. On my SATA disks I get the following:

SET work_mem TO '300MB';
EXPLAIN (analyze,buffers) SELECT id FROM foo WHERE val = 55;

  Bitmap Heap Scan on foo  (...) (actual time=90.315..35091.665 rows=215022 loops=1)
    Heap Blocks: exact=140489
    Buffers: shared hit=20775 read=120306 written=24124

SET maintenance_work_mem TO '1GB';
CLUSTER foo USING val_index;
EXPLAIN (analyze,buffers) SELECT id FROM foo WHERE val = 55;

  Bitmap Heap Scan on foo  (...) (actual time=49.215..407.505 rows=215022 loops=1)
    Heap Blocks: exact=1163
    Buffers: shared read=1755

Of course, CLUSTER is a one-time operation: rows inserted or updated afterwards are not kept in index order, so the query will gradually get slower again over time.
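To restore the physical order later, the table can be re-clustered. A sketch, assuming the table was already clustered once with val_index. Note that CLUSTER takes an ACCESS EXCLUSIVE lock, so the table is unavailable to other sessions while it runs:

```sql
-- Re-cluster on the index remembered from the earlier CLUSTER ... USING.
CLUSTER foo;

-- CLUSTER rewrites the table, so refresh the planner statistics afterwards.
ANALYZE foo;

-- A correlation close to 1 (or -1) means the heap follows the index order.
SELECT correlation
FROM pg_stats
WHERE tablename = 'foo' AND attname = 'val';
```

Watching the correlation drift away from 1 is one way to decide when another CLUSTER run is worthwhile.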

You can try reducing random_page_cost; for an SSD it can be 1. Second, you can increase work_mem; 10 MB is a relatively low value for current servers with gigabytes of RAM. You should also recheck effective_cache_size, which may be too low as well.
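These settings can be tried per session before touching postgresql.conf. A sketch; the effective_cache_size value is an assumption for a machine with 4 GB of RAM, not a figure from the answer:

```sql
-- Check what the session currently uses.
SHOW work_mem;

-- Planner cost settings, changed for this session only.
SET random_page_cost = 1;          -- suggested above for SSD storage
SET effective_cache_size = '2GB';  -- assumed value, adjust to the machine

EXPLAIN (ANALYZE, BUFFERS) SELECT id FROM foo WHERE val = 55;

-- Return to the configured defaults.
RESET random_page_cost;
RESET effective_cache_size;
```

Both parameters only influence which plan the planner picks. They do not make the disk itself any faster, so the effect shows up as a different plan rather than a quicker version of the same one.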

work_mem * max_connections * 2 + shared_buffers < RAM dedicated for Postgres
effective_cache_size ~ shared_buffers + file system cache
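The first rule of thumb can be evaluated against the live configuration. A sketch, assuming the default 8 kB block size (so shared_buffers is reported in 8 kB pages) and work_mem reported in kB, as in the configuration listed in the question:

```sql
-- work_mem * max_connections * 2 + shared_buffers, converted to bytes.
SELECT pg_size_pretty(
         (SELECT setting::bigint * 1024
            FROM pg_settings WHERE name = 'work_mem')
       * (SELECT setting::bigint
            FROM pg_settings WHERE name = 'max_connections')
       * 2
       + (SELECT setting::bigint * 8192
            FROM pg_settings WHERE name = 'shared_buffers')
       ) AS worst_case_memory;
```

With the values from the question (work_mem of 204800 kB, max_connections of 100, shared_buffers of 16384 pages) this comes to roughly 39 GB, far above the 4 GB of RAM in that machine. By this rule, either work_mem or max_connections would need to come down.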
