
PostgreSQL: How to optimize an index for a substring of a column?

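How can I optimize an index for a substring of a column?

For example, I have a column postal_code that stores 5-character strings. If most of my queries filter on the first 2 characters, an index on the whole column is of no use.

What if I create an index on just the substring: CREATE INDEX ON index.annonces_parsed (left(postal_code, 2))

Is this a good solution, or is it better to add a new column that stores only the substring and put an index on that?

A query that would use the index could be: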

select *
from index.cities
where left(postal_code, 2) = '83' --- Will it use the index on the substring ?

Thanks a lot.

I have a test table with 20 million records.

Test 1

CREATE INDEX test_a1_idx ON test (a1)

explain analyze 
select * from test 
where left(a1, 2) = '58'

Gather  (cost=1000.00..103565.05 rows=40000 width=12) (actual time=0.429..468.428 rows=89712 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on test  (cost=0.00..98565.05 rows=16667 width=12) (actual time=0.114..407.330 rows=29904 loops=3)
        Filter: ("left"(a1, 2) = '58'::text)
        Rows Removed by Filter: 2636765
Planning Time: 0.424 ms
Execution Time: 470.472 ms


explain analyze 
select * from test 
where a1 like '58%'

Gather  (cost=1000.00..99284.01 rows=80523 width=12) (actual time=0.990..337.339 rows=89712 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on test  (cost=0.00..90231.71 rows=33551 width=12) (actual time=0.233..278.740 rows=29904 loops=3)
        Filter: (a1 ~~ '58%'::text)
        Rows Removed by Filter: 2636765
Planning Time: 0.092 ms
Execution Time: 339.259 ms

Test 2

CREATE INDEX test_a1_idx1 ON test (left(a1, 2))

explain analyze 
select * from test 
where left(a1, 2) = '58'

Bitmap Heap Scan on test  (cost=446.43..49455.46 rows=40000 width=12) (actual time=10.507..206.800 rows=89712 loops=1)
  Recheck Cond: ("left"(a1, 2) = '58'::text)
  Heap Blocks: exact=38298
  ->  Bitmap Index Scan on test_a1_idx1  (cost=0.00..436.43 rows=40000 width=0) (actual time=5.450..5.450 rows=89712 loops=1)
        Index Cond: ("left"(a1, 2) = '58'::text)
Planning Time: 0.501 ms
Execution Time: 209.217 ms
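
The plan above estimates rows=40000 while the scan actually returns 89712 rows. That estimate looks like a planner default rather than one based on statistics, which suggests (an assumption, since the answer does not say) that the table had not been analyzed after the expression index was created. PostgreSQL collects statistics for indexed expressions when the table is analyzed, so a sketch of the follow-up step could be:

```sql
-- Gather statistics for the table, including the indexed expression left(a1, 2).
ANALYZE test;

-- Re-check the plan; the row estimate should now be based on collected statistics.
EXPLAIN ANALYZE
SELECT * FROM test
WHERE left(a1, 2) = '58';
```

A better estimate does not necessarily change the chosen plan here, but it helps the planner when this filter is combined with joins or other conditions.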

explain analyze 
select * from test 
where a1 like '58%'

Gather  (cost=1000.00..99284.01 rows=80523 width=12) (actual time=0.341..334.759 rows=89712 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on test  (cost=0.00..90231.71 rows=33551 width=12) (actual time=0.110..287.313 rows=29904 loops=3)
        Filter: (a1 ~~ '58%'::text)
        Rows Removed by Filter: 2636765
Planning Time: 0.067 ms
Execution Time: 336.762 ms
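
In both tests the LIKE '58%' query falls back to a sequential scan, even though a plain index on a1 exists. One possible explanation, assuming the database uses a collation other than C (the answer does not show this), is that a default B-tree index cannot serve prefix LIKE searches under such a collation. In that situation an index built with the text_pattern_ops operator class can. The index name below is hypothetical:

```sql
-- Assumption: the database collation is not "C", so the default operator class
-- cannot be used for LIKE 'prefix%'. text_pattern_ops compares character by character.
CREATE INDEX test_a1_pattern_idx ON test (a1 text_pattern_ops);

-- With the index above, the planner can turn the prefix match into an index range scan.
EXPLAIN ANALYZE
SELECT * FROM test
WHERE a1 LIKE '58%';
```

The trade-off is that such an index supports any prefix length, not only the first 2 characters, but it does not help ordinary <, > comparisons under a non-C collation.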

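Result:

Note that when a condition wraps the column in a function, the database does not use the plain index on that column. An index on the same expression (a function index) is used instead, and it performs very well in these cases: in Test 2 the left(a1, 2) = '58' query dropped from about 470 ms to about 209 ms.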
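
For the alternative raised in the question, a separate column holding only the substring, a minimal sketch using a stored generated column is shown below. It assumes PostgreSQL 12 or later, and the column and index names (a1_prefix, test_a1_prefix_idx) are hypothetical:

```sql
-- Assumption: PostgreSQL 12+, which supports stored generated columns.
-- The column is kept in sync with a1 automatically; adding it rewrites the table.
ALTER TABLE test
    ADD COLUMN a1_prefix text GENERATED ALWAYS AS (left(a1, 2)) STORED;

-- A plain B-tree index on the new column.
CREATE INDEX test_a1_prefix_idx ON test (a1_prefix);

-- Queries filter on the column directly instead of repeating the expression.
EXPLAIN ANALYZE
SELECT * FROM test
WHERE a1_prefix = '58';
```

Compared with the expression index, this costs extra storage per row, but queries do not have to repeat the exact indexed expression to benefit from the index.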
