Why isn't Postgres using the index with Distinct?
I have this table:
CREATE TABLE public.prodhistory (
    curve_id          int4      NOT NULL,
    start_prod_date   date      NOT NULL,
    prod_date         date      NOT NULL,
    monthly_prod_rate float4    NOT NULL,
    eff_date          timestamp NOT NULL,
    /* Keys */
    CONSTRAINT prodhistorypk
        PRIMARY KEY (curve_id, prod_date, start_prod_date, eff_date),
    /* Foreign keys */
    CONSTRAINT prodhistory2typecurves_fk
        FOREIGN KEY (curve_id)
        REFERENCES public.typecurves(curve_id)
) WITH (
    OIDS = FALSE
);
CREATE INDEX prodhistory_idx_curve_id01
    ON public.prodhistory
    (curve_id);
with ~42M rows.
And I execute this query:
SELECT DISTINCT curve_id FROM prodhistory
Which I expected to be very quick, given the index. But no, it took 270 secs. So I ran EXPLAIN, and I get:
HashAggregate  (cost=824870.03..824873.08 rows=305 width=4) (actual time=211834.018..211834.097 rows=315 loops=1)
  Output: curve_id
  Group Key: prodhistory.curve_id
  ->  Seq Scan on public.prodhistory  (cost=0.00..718003.22 rows=42746722 width=4) (actual time=12.751..200826.299 rows=43218808 loops=1)
        Output: curve_id
Planning time: 0.115 ms
Execution time: 211848.137 ms
I'm not too experienced in reading these plans, but a Seq Scan on the DB seems bad.
Any thoughts? I'm sort of stumped.
This plan is chosen because PostgreSQL thinks it is cheaper.
You can compare by setting
SET enable_seqscan=off;
and then re-running your EXPLAIN (ANALYZE) statement. Compare cost and actual time in both cases and check whether PostgreSQL estimated correctly.
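As a rough sketch of that comparison, the session below runs the query from the question under both settings. The BUFFERS option is an addition that is not part of the original advice; it only adds I/O counts to the output. Note that enable_seqscan=off does not forbid sequential scans, it only makes the planner treat them as very expensive, so the setting affects which plan is picked and is meant for diagnosis rather than permanent use.

```sql
-- Baseline: the plan PostgreSQL picks with default settings
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT curve_id FROM prodhistory;

-- Discourage sequential scans for the current session only
SET enable_seqscan = off;

-- Same query again; compare "cost" and "actual time" with the baseline
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT curve_id FROM prodhistory;

-- Restore the default so later queries are planned normally
RESET enable_seqscan;
```

If the second plan has a higher estimated cost but a clearly lower actual time, the planner's cost estimates do not match the machine well.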
If you find that using an Index Scan or Index Only Scan is actually cheaper, you could consider adjusting the cost parameters to match your machine better, e.g. lower random_page_cost or cpu_index_tuple_cost, or raise cpu_tuple_cost.
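A minimal sketch of trying such a change at session level before touching postgresql.conf. The value 1.1 is only an illustrative assumption (a number sometimes used for SSD-backed storage), not a recommendation from the answer; the right value depends on your hardware and should be checked against measured run times.

```sql
-- Inspect the current values (stock defaults: 4, 0.005 and 0.01)
SHOW random_page_cost;
SHOW cpu_index_tuple_cost;
SHOW cpu_tuple_cost;

-- Try a lower random-access cost for this session only
-- (1.1 is an assumed example value, not a measured one)
SET random_page_cost = 1.1;

-- Check whether the planner now prefers an index-based plan
-- and whether it really runs faster
EXPLAIN (ANALYZE)
SELECT DISTINCT curve_id FROM prodhistory;

-- Undo the experiment
RESET random_page_cost;
```

Because SET only affects the current session, nothing changes for other connections until the parameter is set in the server configuration.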