ST_DWithin does not use the GiST or BRIN index

Question

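I am using the PostGIS function ST_DWithin(geography gg1, geography gg2, double precision distance_meters) to find whether a point lies within a specified distance of a polygon. I am running tests to see how long the query takes, and EXPLAIN shows that the query runs a sequential scan on the table instead of using the BRIN or GiST index. Can someone suggest a way to optimize it?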

Here are the tables:

table1 (incident_geog), with polygons

CREATE TABLE public.incident_geog
(
    incident_id integer NOT NULL DEFAULT nextval('incident_geog_incident_id_seq'::regclass),
    incident_name character varying(20) COLLATE pg_catalog."default",
    incident_span geography(Polygon,4326),
    CONSTRAINT incident_geog_pkey PRIMARY KEY (incident_id)
);

CREATE INDEX incident_geog_gix
    ON public.incident_geog USING gist
    (incident_span);

table2 (watchzones_geog), with points and distances

CREATE TABLE public.watchzones_geog
(
    id integer NOT NULL DEFAULT nextval('watchzones_geog_id_seq'::regclass),
    date_created timestamp with time zone DEFAULT now(),
    latitude numeric(10,7) DEFAULT NULL::numeric,
    longitude numeric(10,7) DEFAULT NULL::numeric,
    radius integer,
    "position" geography(Point,4326),
    CONSTRAINT watchzones_geog_pkey PRIMARY KEY (id)
);

CREATE INDEX watchzones_geog_gix
    ON public.watchzones_geog USING gist
    ("position");

SQL using ST_DWithin:

explain select i.incident_id,wz.id from watchzones_geog wz, incident_geog i where ST_DWithin(position,incident_span,wz.radius * 1000);

EXPLAIN output:

Nested Loop  (cost=0.26..418436.69 rows=1 width=8)
   ->  Seq Scan on watchzones_geog wz  (cost=0.00..13408.01 rows=600001 width=40)
   ->  Index Scan using incident_geog_gix on incident_geog i  (cost=0.26..0.67 rows=1 width=292)
         Index Cond: (incident_span && _st_expand(wz."position", ((wz.radius * 1000))::double precision))
         Filter: ((wz."position" && _st_expand(incident_span, ((wz.radius * 1000))::double precision)) AND _st_dwithin(wz."position", incident_span, ((wz.radius * 1000))::double precision, true))

Answer 1

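What your SQL actually does is find the polygons that lie within the specified distance of each point. Each result row pairs an incident_geog.incident_id with a watchzones_geog.id. Because you evaluate the condition for every point, the planner uses a sequential scan on the points table.

I guess you want to start from the polygons and look for the points. In that case, the table order in your SQL needs to change.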

explain select i.incident_id,wz.id from incident_geog i, watchzones_geog wz where ST_DWithin(position,incident_span,50);

We can see:

Nested Loop  (cost=0.27..876.00 rows=1 width=16)
   ->  Seq Scan on incident_geog i  (cost=0.00..22.00 rows=1200 width=40)
   ->  Index Scan using watchzones_geog_gix on watchzones_geog wz  (cost=0.27..0.70 rows=1 width=40)
         Index Cond: ("position" && _st_expand(i.incident_span, '50'::double precision))
         Filter: ((i.incident_span && _st_expand("position", '50'::double precision)) AND _st_dwithin("position", i.incident_span, '50'::double precision, true))

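Because you operate on every row, there will always be one table whose records are all traversed by a sequential scan. The results of the two SQL statements do not differ. The key is which table you start from when looking up rows in the other.

Maybe you can try Parallel Query. First, without Parallel Query: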
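To see which table actually drives the nested loop on real data, the plan can be measured rather than only estimated. The following is a minimal sketch, assuming both tables are loaded and that the per-row radius from the question (wz.radius * 1000) is the intended distance. Note that with a per-row radius taken from watchzones_geog, the index condition on "position" would have to reference a column of the same table, so the planner will likely keep scanning watchzones_geog sequentially and probe incident_geog_gix, as in the question's plan.

```sql
-- Refresh planner statistics so the row estimates reflect the loaded data.
ANALYZE incident_geog;
ANALYZE watchzones_geog;

-- ANALYZE runs the query and reports actual times; BUFFERS adds page-access counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT i.incident_id, wz.id
FROM incident_geog AS i
JOIN watchzones_geog AS wz
  ON ST_DWithin(wz."position", i.incident_span, wz.radius * 1000);
```

The outer (first-listed) child of the Nested Loop node in the output is the table that is scanned in full; the inner child is the one probed through its GiST index.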

SET parallel_tuple_cost TO 0;
explain analyze select i.incident_id,wz.id from incident_geog i, watchzones_geog wz where ST_DWithin(position,incident_span,50);

Nested Loop  (cost=0.27..876.00 rows=1 width=16) (actual time=0.002..0.002 rows=0 loops=1)
   ->  Seq Scan on incident_geog i  (cost=0.00..22.00 rows=1200 width=40) (actual time=0.002..0.002 rows=0 loops=1)
   ->  Index Scan using watchzones_geog_gix on watchzones_geog wz  (cost=0.27..0.70 rows=1 width=40) (never executed)
         Index Cond: ("position" && _st_expand(i.incident_span, '50'::double precision))
         Filter: ((i.incident_span && _st_expand("position", '50'::double precision)) AND _st_dwithin("position", i.incident_span, '50'::double precision, true))
 Planning time: 0.125 ms
 Execution time: 0.028 ms

Then try Parallel Query with parallel_tuple_cost set to 2:

SET parallel_tuple_cost TO 2;
explain analyze select i.incident_id,wz.id from incident_geog i, watchzones_geog wz where ST_DWithin(position,incident_span,50);

Nested Loop  (cost=0.27..876.00 rows=1 width=16) (actual time=0.002..0.002 rows=0 loops=1)
   ->  Seq Scan on incident_geog i  (cost=0.00..22.00 rows=1200 width=40) (actual time=0.001..0.001 rows=0 loops=1)
   ->  Index Scan using watchzones_geog_gix on watchzones_geog wz  (cost=0.27..0.70 rows=1 width=40) (never executed)
         Index Cond: ("position" && _st_expand(i.incident_span, '50'::double precision))
         Filter: ((i.incident_span && _st_expand("position", '50'::double precision)) AND _st_dwithin("position", i.incident_span, '50'::double precision, true))
 Planning time: 0.103 ms
 Execution time: 0.013 ms
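Neither of the two plans above contains a Gather node, which is how a parallel plan would show up in EXPLAIN output. As a rough sketch of how one might check whether the planner is allowed to consider a parallel plan at all, the session settings below can be inspected and adjusted. The values chosen (4 workers, zero costs) are illustrative assumptions for experimentation, not recommendations, and on small or empty tables the planner may still choose a serial plan.

```sql
-- Inspect the settings that influence whether a parallel plan is considered.
SHOW max_parallel_workers_per_gather;  -- 0 disables parallel query entirely
SHOW parallel_setup_cost;
SHOW parallel_tuple_cost;

-- Session-level overrides for experimentation only (illustrative values).
SET max_parallel_workers_per_gather TO 4;
SET parallel_setup_cost TO 0;
SET parallel_tuple_cost TO 0;

-- Re-run the query; a parallel plan would contain a Gather node.
EXPLAIN ANALYZE
SELECT i.incident_id, wz.id
FROM incident_geog AS i, watchzones_geog AS wz
WHERE ST_DWithin(wz."position", i.incident_span, 50);

-- Return the session to the server defaults afterwards.
RESET max_parallel_workers_per_gather;
RESET parallel_setup_cost;
RESET parallel_tuple_cost;
```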

Answer 2

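Some general notes:

Use IDENTITY COLUMNS rather than setting up sequences by hand.
You don't need DEFAULT NULL::numeric; the default of a nullable column is always null.
After loading the tables, make sure to VACUUM ANALYZE both of them.

Don't use SQL-89; write out your INNER JOIN ... ON instead: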

 SELECT i.incident_id,wz.id FROM watchzones_geog wz INNER JOIN incident_geog i ON ST_DWithin(wz.position,i.incident_span,50);

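In your EXPLAIN output the query uses wz.radius * 1000, but here the radius is 50. Which is it? Does the query still do a seq scan if you type in the radius statically?
If you're not using the longitude and latitude columns on the table, drop them. There's no reason to store the coordinates twice.
I wouldn't use varchar(20); I'd just use text. It is faster because there is no length check, and it is implemented the same way.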
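A minimal sketch of how the notes above might look when applied to the points table on a fresh schema. It assumes PostgreSQL 10 or later (required for identity columns) and that latitude and longitude are not needed elsewhere. The constant 50000 in the last query is only an illustrative radius in meters, not a value taken from the question.

```sql
-- Points table rewritten along the lines suggested above (assumes PostgreSQL 10+).
CREATE TABLE public.watchzones_geog
(
    -- Identity column replaces the hand-made sequence and nextval() default.
    id integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    date_created timestamp with time zone DEFAULT now(),
    -- Assumed to be in kilometers, judging by the "* 1000" in the question.
    radius integer,
    -- latitude/longitude are dropped; the point already stores them.
    "position" geography(Point,4326)
);

CREATE INDEX watchzones_geog_gix
    ON public.watchzones_geog USING gist ("position");

-- After loading data, refresh statistics on both tables.
VACUUM ANALYZE public.watchzones_geog;
VACUUM ANALYZE public.incident_geog;

-- Check whether the plan changes when the radius is a constant (illustrative value).
EXPLAIN
SELECT i.incident_id, wz.id
FROM watchzones_geog AS wz
INNER JOIN incident_geog AS i
    ON ST_DWithin(wz."position", i.incident_span, 50000);
```

The same idea applies to incident_geog: incident_name could be declared as text instead of character varying(20).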
