Query optimization in PostgreSQL

Question

I have rather simple query with big total runtime. Can you advise me how can I optimize it?

Here is explain analyze: http://explain.depesz.com/s/9xC5

query:

select wpis_id from spoleczniak_oznaczone
where etykieta_id in(
  select tag_id
  from spoleczniak_subskrypcje
  where postac_id = 376476
  );

spoleczniak_oznaczone:

 Column    |  Type   |                             Modifiers
-------------+---------+--------------------------------------------------------------------
 id          | integer | not null default nextval('spoleczniak_oznaczone_id_seq'::regclass)
 etykieta_id | integer | not null
 wpis_id     | integer | not null
Indexes:
    "spoleczniak_oznaczone_pkey" PRIMARY KEY, btree (id)
    "spoleczniak_oznaczone_etykieta_id" btree (etykieta_id)
    "spoleczniak_oznaczone_wpis_id" btree (wpis_id)
Foreign-key constraints:
    "spoleczniak_oznaczone_etykieta_id_fkey" FOREIGN KEY (etykieta_id) REFERENCES    spoleczniak_etykiety(id) DEFERRABLE INITIALLY DEFERRED
    "spoleczniak_oznaczone_wpis_id_fkey" FOREIGN KEY (wpis_id) REFERENCES spoleczniak_tablica(id) DEFERRABLE INITIALLY DEFERRED

spoleczniak_subskrypcje:

  Column   |  Type   |                              Modifiers
-----------+---------+----------------------------------------------------------------------
 id        | integer | not null default nextval('spoleczniak_subskrypcje_id_seq'::regclass)
 postac_id | integer | not null
 tag_id    | integer | not null
Indexes:
    "spoleczniak_subskrypcje_pkey" PRIMARY KEY, btree (id)
    "spoleczniak_subskrypcje_postac_id" btree (postac_id)
    "spoleczniak_subskrypcje_postac_tag" btree (postac_id, tag_id)
    "spoleczniak_subskrypcje_tag_id" btree (tag_id)
Foreign-key constraints:
    "spoleczniak_subskrypcje_postac_id_fkey" FOREIGN KEY (postac_id) REFERENCES postac_postacie(id) DEFERRABLE INITIALLY DEFERRED
    "spoleczniak_subskrypcje_tag_id_fkey" FOREIGN KEY (tag_id) REFERENCES spoleczniak_etykiety(id) DEFERRABLE INITIALLY DEFERRED

Answer 1

From the Query Plan, most of the time seems to be involved in working out the IN part of the where clause. Proper indexes seem to be used.

select o.wpis_id 
from spoleczniak_oznaczone o
inner join spoleczniak_subskrypcje s on s.tag_id = o.etykieta_id
where s.postac_id = 376476

...looks to be functionally the same but tries it in a different way and could generate a different query plan.

Also, as @wildplasser says, make sure statistics are up-to-date, and indexes defragmented (don't know how to do those in PostgreSQL myself).

EDIT: as @a_horse_with_no_name says in the comment below, the query I've suggested can return duplicates where the original wouldn't. Without knowing your data I don't know whether it will or not. That's a warning to bear in mind.

Answer 2

Is there a reason you preferred using in and a subquery to:

select wpis_id 
from spoleczniak_oznaczone, spoleczniak_subskrypcje
where etykieta_id = tag_id 
and postac_id = 376476

I would guess a simple join might be simpler for the query optimiser.

Answer 3

This should be equivalent (and in most cases will generate the same query plan)

SELECT so.wpis_id
FROM spoleczniak_oznaczone so
WHERE EXISTS (
  SELECT *
  FROM spoleczniak_subskrypcje ss
  WHERE ss.tag_id= so.etykieta_id
  AND so.postac_id = 376476
  );

Answer 4

Try replacing this index:

"spoleczniak_oznaczone_etykieta_id" btree (etykieta_id)

with an index on (etykieta_id, wpis_id) . This way DB could perform index-only scan (without fetching whole rows from table which costs access time).

Query optimization in PostgreSQL

Question

4 answers

solution1
2 2014-03-19 14:54:43

solution2
1 2014-03-19 14:56:39

solution3
1 2014-03-19 15:12:13

solution4
1 2014-03-19 21:04:56

Query optimization in PostgreSQL

Question

4 answers

solution1 2 2014-03-19 14:54:43

solution2 1 2014-03-19 14:56:39

solution3 1 2014-03-19 15:12:13

solution4 1 2014-03-19 21:04:56

solution1
2 2014-03-19 14:54:43

solution2
1 2014-03-19 14:56:39

solution3
1 2014-03-19 15:12:13

solution4
1 2014-03-19 21:04:56