PostgreSQL 8.4; Three tables - store (~100k, pk id, fk supplier_id & item_id), supplier(~10 pk supplier_id), item(~1000 pk item_id);
I created the following query to get the data I need:
SELECT store.quantity, store.price, x.supplier_name
FROM store NATURAL JOIN
(SELECT * FROM item NATURAL JOIN supplier) AS x
WHERE store.price > 500 AND store.quantity > 0 AND
store.quantity < 100 AND
x.item_name = 'SomeName';
The query plan:
Nested Loop (cost=20.76..6513.55 rows=8 width=229)
-> Hash Join (cost=20.76..6511.30 rows=8 width=15)
Hash Cond: (store.item_id = item.item_id)
-> Seq Scan on store (cost=0.00..6459.00 rows=8388 width=23)
Filter: ((price > 500::numeric) AND (quantity > 0) AND (quantity < 100))
-> Hash (cost=20.75..20.75 rows=1 width=8)
-> Seq Scan on item (cost=0.00..20.75 rows=1 width=8)
Filter: ((item_name)::text = 'SomeName'::text)
-> Index Scan using supplier_pkey on supplier (cost=0.00..0.27 rows=1 width=222)
Index Cond: (supplier.supplier_id = store.supplier_id)
Now the aim is to reduce the cost by more than 30% by optimizing the query itself. The only instances of this problem I found were solved by modifying the table or the server settings, but I am looking to do this by modifying nothing else than the query and that's where I fell short in research.
Clearly the issue to be solved is the Seq Scan, which brings me to thinking I need to arrange it so that the scanning/filtering is applied only to a subset of the store table - but iirc you need to scan the table in any such case, so maybe use something else than a Seq Scan? Index scan isn't going to help since I wouldn't be filtering by the index... I'm puzzled here because this seems more of a choice that the PostgreSQL optimizer makes and not something I can change at will...
(If you're wondering, this was part of an assignment and I'm asking here because I have spent quite a few hours researching the problem failing to find anything relevant, and I just gave up on it, but I'm still curious...)
You can probably fix this with indexes. It is a little hard to tell what the keys are because of the "natural join"s. (I recommend using
instead of natural join
so you can at least see what keys are being used and if one of the tables is modified, it won't mess up the join.)
I think an index on item(item_name, item_id)
would help the query plan.
Will be hard to optimize because it looks nice, try this to avoid subquery :
SELECT
store.quantity,
store.price,
supplier.supplier_name
FROM store
INNER JOIN item
ON store.item_id = item.item_id
INNER JOIN supplier
ON supplier.supplier_id = store.supplier_id
AND supplier.item_name = 'SomeName'
WHERE
store.price > 500
AND store.quantity BETWEEN 0 AND 100;
Use BETWEEN
it's better.
Also, add indexes on :
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.