简体   繁体   中英

Query optimization in PostgreSQL

I have rather simple query with big total runtime. Can you advise me how can I optimize it?

Here is explain analyze: http://explain.depesz.com/s/9xC5

query:

select wpis_id from spoleczniak_oznaczone
where etykieta_id in(
  select tag_id
  from spoleczniak_subskrypcje
  where postac_id = 376476
  );

spoleczniak_oznaczone:

 Column    |  Type   |                             Modifiers
-------------+---------+--------------------------------------------------------------------
 id          | integer | not null default nextval('spoleczniak_oznaczone_id_seq'::regclass)
 etykieta_id | integer | not null
 wpis_id     | integer | not null
Indexes:
    "spoleczniak_oznaczone_pkey" PRIMARY KEY, btree (id)
    "spoleczniak_oznaczone_etykieta_id" btree (etykieta_id)
    "spoleczniak_oznaczone_wpis_id" btree (wpis_id)
Foreign-key constraints:
    "spoleczniak_oznaczone_etykieta_id_fkey" FOREIGN KEY (etykieta_id) REFERENCES    spoleczniak_etykiety(id) DEFERRABLE INITIALLY DEFERRED
    "spoleczniak_oznaczone_wpis_id_fkey" FOREIGN KEY (wpis_id) REFERENCES spoleczniak_tablica(id) DEFERRABLE INITIALLY DEFERRED

spoleczniak_subskrypcje:

  Column   |  Type   |                              Modifiers
-----------+---------+----------------------------------------------------------------------
 id        | integer | not null default nextval('spoleczniak_subskrypcje_id_seq'::regclass)
 postac_id | integer | not null
 tag_id    | integer | not null
Indexes:
    "spoleczniak_subskrypcje_pkey" PRIMARY KEY, btree (id)
    "spoleczniak_subskrypcje_postac_id" btree (postac_id)
    "spoleczniak_subskrypcje_postac_tag" btree (postac_id, tag_id)
    "spoleczniak_subskrypcje_tag_id" btree (tag_id)
Foreign-key constraints:
    "spoleczniak_subskrypcje_postac_id_fkey" FOREIGN KEY (postac_id) REFERENCES postac_postacie(id) DEFERRABLE INITIALLY DEFERRED
    "spoleczniak_subskrypcje_tag_id_fkey" FOREIGN KEY (tag_id) REFERENCES spoleczniak_etykiety(id) DEFERRABLE INITIALLY DEFERRED

From the Query Plan, most of the time seems to be involved in working out the IN part of the where clause. Proper indexes seem to be used.

select o.wpis_id 
from spoleczniak_oznaczone o
inner join spoleczniak_subskrypcje s on s.tag_id = o.etykieta_id
where s.postac_id = 376476

...looks to be functionally the same but tries it in a different way and could generate a different query plan.

Also, as @wildplasser says, make sure statistics are up-to-date, and indexes defragmented (don't know how to do those in PostgreSQL myself).

EDIT: as @a_horse_with_no_name says in the comment below, the query I've suggested can return duplicates where the original wouldn't. Without knowing your data I don't know whether it will or not. That's a warning to bear in mind.

Is there a reason you preferred using in and a subquery to:

select wpis_id 
from spoleczniak_oznaczone, spoleczniak_subskrypcje
where etykieta_id = tag_id 
and postac_id = 376476

I would guess a simple join might be simpler for the query optimiser.

This should be equivalent (and in most cases will generate the same query plan)

SELECT so.wpis_id
FROM spoleczniak_oznaczone so
WHERE EXISTS (
  SELECT *
  FROM spoleczniak_subskrypcje ss
  WHERE ss.tag_id= so.etykieta_id
  AND so.postac_id = 376476
  );

Try replacing this index:

"spoleczniak_oznaczone_etykieta_id" btree (etykieta_id)

with an index on (etykieta_id, wpis_id) . This way DB could perform index-only scan (without fetching whole rows from table which costs access time).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM