
PostgreSQL does a seq scan when an index applies. Why?

I have a query with a join on a varchar(24) primary key. The reasons for that column being the key are legacy and it is targeted for change. However, the PostgreSQL query planner insists on doing a sequential scan, which seems unreasonable to me. I back up my claim of "unreasonable" with the fact that "SET enable_seqscan = off" speeds up this query by a factor of 8.

I've run "vacuum analyze" and I've played with statistics settings, but I've had no luck so far.
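
For reference, what I mean by that is roughly the following (the statistics target of 500 is only an example value):

vacuum analyze inventry;
vacuum analyze invenwh;

-- Raise the per-column statistics target for the join key, then re-analyze.
alter table invenwh alter column id set statistics 500;
analyze invenwh;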

The query is

select inventry.id, inventry.count, sum(invenwh.count) 
from invenwh join inventry on inventry.id=invenwh.id
where inventry.product_c='CAT17' 
group by 1, 2;

The following sets up the database for running this query.

drop table if exists inventry;
drop table if exists inwh;
drop table if exists invenwh;
drop table if exists inprodcategory;

-- Create 50 product categories.
create table inprodcategory as 
select i as id, concat('CAT', lpad(i::text, 2, '0'))::varchar(10) as category
from generate_series(1, 50, 1) as i;

-- Create inventory items (240,100 rows: the join below drops the 4,900 values of i where i%50 = 0)
create table inventry as 
select 
    concat('ITEM', lpad(i::text, 6, '0'))::varchar(24) as id, 
    concat('Item #', i::text)::varchar(50) as descr_1,
    c.category as product_c,
    (case when random() < 0.05 then (random()*70)::int else 0::int end) as count
from generate_series(1, 245000, 1) as i
    join inprodcategory as c on c.id=(i%50)::int;

-- Create 70 warehouses
create table inwh as 
select concat('WAREHOUSE', lpad(i::text, 2, '0'))::varchar(10) as warehouse
from generate_series(1, 70, 1) as i;

-- Create (ugly) cross-join table with counts/warehouse
create table invenwh as 
select id, warehouse, 
    (case when random() < 0.05 then (random()*10)::int else 0::int end) as count
from inventry, inwh;

create index on invenwh (id);
create index on inventry (id);

After running the above, you can run the query. On my hardware (an SSD, an i7, and 16 GB of RAM) it takes about 4 seconds, but if I run "set enable_seqscan=off", it takes about 500 ms.
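
The timing comparison is done in a single session, roughly like this (enable_seqscan is only a diagnostic toggle here, not something to leave off in production):

\timing on
set enable_seqscan = off;
select inventry.id, inventry.count, sum(invenwh.count)
from invenwh join inventry on inventry.id=invenwh.id
where inventry.product_c='CAT17'
group by 1, 2;
reset enable_seqscan;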

Edit: added explain (analyze, buffers) output

HashAggregate  (cost=449773.25..449822.25 rows=4900 width=19) (actual time=4180.006..4181.092 rows=4900 loops=1)
  Group Key: inventry.id, inventry.count
  Buffers: shared hit=4526 read=121051
  ->  Hash Join  (cost=5058.50..447200.75 rows=343000 width=19) (actual time=1285.800..4086.398 rows=343000 loops=1)
        Hash Cond: ((invenwh.id)::text = (inventry.id)::text)
        Buffers: shared hit=4526 read=121051
        ->  Seq Scan on invenwh  (cost=0.00..291651.00 rows=16807000 width=15) (actual time=0.077..1949.843 rows=16807000 loops=1)
              Buffers: shared hit=2530 read=121051
        ->  Hash  (cost=4997.25..4997.25 rows=4900 width=15) (actual time=48.897..48.897 rows=4900 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 230kB
              Buffers: shared hit=1996
              ->  Seq Scan on inventry  (cost=0.00..4997.25 rows=4900 width=15) (actual time=21.903..47.031 rows=4900 loops=1)
                    Filter: ((product_c)::text = 'CAT17'::text)
                    Rows Removed by Filter: 235200
                    Buffers: shared hit=1996
Planning time: 4.266 ms
Execution time: 4181.395 ms

Edit: Specific follow-up questions

Thanks to @a_horse_with_no_name (big thank you!), it seems like lowering random_page_cost is the thing to do. This is more or less in agreement with https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Q: Is there any benchmark I can run to discover optimal values for random_page_cost? In production, I'm on a SCSI disk (LSI MR9260-8i).
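
For now I can try candidate values per session against the real query and compare timings (the values below are arbitrary), but that feels ad hoc:

set random_page_cost = 1.5;
explain (analyze, buffers)
select inventry.id, inventry.count, sum(invenwh.count)
from invenwh join inventry on inventry.id=invenwh.id
where inventry.product_c='CAT17'
group by 1, 2;

set random_page_cost = 2.0;
-- re-run the same explain (analyze, buffers) and compare

reset random_page_cost;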

Q: I feel like statistics may also be relevant here, but I'm coming up empty on a pg-stats-for-dummies type page on the internet. Any hints on learning about stats?
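
For concreteness, the sort of thing I mean by "statistics" is the per-column data the planner keeps in pg_stats, e.g.:

-- What the planner currently knows about the join and filter columns.
select tablename, attname, n_distinct, null_frac, correlation
from pg_stats
where tablename in ('inventry', 'invenwh')
  and attname in ('id', 'product_c');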

When the costs estimated by the planner don't match the reality of the execution time, cost settings should be adjusted to better match your hardware.

The various knobs are documented in the PostgreSQL manual under Planner Cost Constants.
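
For reference, the values currently in effect can be checked with something like:

-- Show the planner cost constants and where each value comes from.
select name, setting, source
from pg_settings
where name in ('seq_page_cost', 'random_page_cost',
               'cpu_tuple_cost', 'cpu_index_tuple_cost', 'cpu_operator_cost');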

In particular there is this advice on random_page_cost that's relevant to your case:

Storage that has a low random read cost relative to sequential, e.g., solid-state drives, might also be better modeled with a lower value for random_page_cost.

See also Random Page Cost Revisited for more tuning advice on this parameter with 5 different storage types.

TL;DR: for an SSD, try 1.5 for random_page_cost first.
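
A minimal sketch of applying that, first per session to confirm the plan change and then persistently (the database name below is a placeholder):

-- Try it in the current session and re-run the query.
set random_page_cost = 1.5;

-- Make it the default for a single database (replace mydb with yours).
alter database mydb set random_page_cost = 1.5;

-- Or set it cluster-wide and reload the configuration (requires superuser).
alter system set random_page_cost = 1.5;
select pg_reload_conf();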
