I've been meddling with databases for a few years now and I'm getting pretty decent with most SQL/PostgreSQL queries, but I still don't understand how a simple FOR-like query should be written in SQL. Here's an example in pseudocode:
FOR id IN SELECT ids FROM parents WHERE name ilike '%something%' LOOP
SELECT parent_id, max(timestamp) FROM children WHERE parent_id = id;
END LOOP;
Note: One parent can have and often has multiple children so there's a one-to-many-relationship between them.
The desired result of that query should be like:
parent_id, max(timestamp)
5, 2015-09-18 10:00:46.684824+03
6, 2015-09-18 10:00:47.684824+03
8, 2015-09-18 10:00:48.684824+03
etc.
The query itself doesn't have to be a for-loop. I'm just interested in how this query should be expressed in SQL since I quite often seem to have a need for it.
Thanks!
A few ways, some better than others.
In general I advocate learning to think in sets when working with SQL and relational databases. JOINs start making lots of sense when you think of them as operations on sets. So do filters like WHERE and groupings like GROUP BY. You'll often find that you can start expressing your queries in English and just "translate" them to SQL after a while. (Or maybe I just write way, way too much SQL and I'm damaged now.)
Using a join and GROUP BY is in my view the clearest and simplest way to express it. You say "here's the relationship between these two tables, now for each p.ids get me the max(c.timestamp)".
SELECT
p.ids,
max(c.timestamp)
FROM parents p
LEFT OUTER JOIN children c ON (p.ids = c.parent_id)
WHERE p.name ILIKE '%something%'
GROUP BY p.ids;
I used a LEFT OUTER JOIN because, in your simple FOR loop, you'd get a result with a parent_id and null max if there were no matching rows. This preserves the same behaviour. If you want no row at all when there are no child rows, use an inner join.
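To make the difference between the two join types concrete, here is a small sketch you can paste into psql. The sample rows and the column types (integer ids, timestamptz timestamps) are my assumptions, not something from your schema, so adjust them to match your tables.

```sql
-- Hypothetical sample data; table layout and types are assumed for illustration.
CREATE TEMP TABLE parents  (ids integer PRIMARY KEY, name text);
CREATE TEMP TABLE children (parent_id integer, "timestamp" timestamptz);

INSERT INTO parents VALUES
    (5, 'Something A'),
    (6, 'something b'),   -- matches the filter but has no children
    (7, 'other');         -- does not match the filter

INSERT INTO children VALUES
    (5, '2015-09-18 09:00:00+03'),
    (5, '2015-09-18 10:00:46+03');

-- LEFT JOIN: parent 6 is kept, and its max is NULL because no child row matched.
SELECT p.ids, max(c."timestamp") AS latest
FROM parents p
LEFT OUTER JOIN children c ON (p.ids = c.parent_id)
WHERE p.name ILIKE '%something%'
GROUP BY p.ids
ORDER BY p.ids;

-- INNER JOIN: parent 6 disappears, because there is no child row to join to.
SELECT p.ids, max(c."timestamp") AS latest
FROM parents p
INNER JOIN children c ON (p.ids = c.parent_id)
WHERE p.name ILIKE '%something%'
GROUP BY p.ids
ORDER BY p.ids;
```

The first query returns a row for parents 5 and 6, the second only for parent 5. Which one you want depends on whether "parent with no children" is a meaningful result for you.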
SELECT
p.ids,
(SELECT max(timestamp) FROM children c WHERE c.parent_id = p.ids)
FROM parents p
WHERE p.name ILIKE '%something%';
This approach is limited to cases where you only want one field from the associated child table unless you start doing horrible things with composite records. It'll generally result in the same query plan as the join approach, but it's less flexible.
It's closer to the "for loop" approach, in that it's saying "for each parent row do this on the child table".
FOR loop in PL/PgSQL
This is the slowest and clumsiest option, but it is almost literally what you wrote.
-- pid is a loop variable that must be declared in the function's DECLARE section
FOR pid IN SELECT ids FROM parents WHERE name ILIKE '%something%' LOOP
RETURN QUERY SELECT pid, max(c.timestamp) FROM children c WHERE c.parent_id = pid;
END LOOP;
Yes, I copied your code almost verbatim. It's close to valid PL/PgSQL; what it mainly lacked was a destination for the results. In the form above you'd need to declare the function as RETURNS TABLE(...).
This last one is PL/PgSQL so it's only valid in a function.
It's the closest to what you wrote, and the simplest when thinking procedurally, but it's actually slow and cumbersome.
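For completeness, here is roughly what the loop looks like once it is wrapped in a function. Treat it as a sketch: the function name latest_child_per_parent, the pattern parameter, the output column names, and the column types are all assumptions I made up for illustration. The CREATE TABLE IF NOT EXISTS lines are only there so the example runs on an empty database; skip them if you already have the tables.

```sql
-- Assumed table layout; these are no-ops if the tables already exist.
CREATE TABLE IF NOT EXISTS parents  (ids integer PRIMARY KEY, name text);
CREATE TABLE IF NOT EXISTS children (parent_id integer, "timestamp" timestamptz);

-- Hypothetical function name and signature.
CREATE OR REPLACE FUNCTION latest_child_per_parent(pattern text)
RETURNS TABLE (parent integer, latest timestamptz)
LANGUAGE plpgsql
AS $$
DECLARE
    pid integer;  -- loop variable holding one parent id at a time
BEGIN
    FOR pid IN SELECT p.ids FROM parents p WHERE p.name ILIKE pattern LOOP
        -- One aggregate query per parent; appends one row to the result set.
        -- With no matching children, max() yields NULL and the row is still returned.
        RETURN QUERY
            SELECT pid, max(c."timestamp")
            FROM children c
            WHERE c.parent_id = pid;
    END LOOP;
END;
$$;

-- Usage:
SELECT * FROM latest_child_per_parent('%something%');
```

Note that this runs one query against children per matching parent, which is exactly why the set-based versions above tend to be faster.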
There are several solutions. You could use a join and a group by, for example. My preferred solution in such a case is the most direct one:
select
id,
(select max(timestamp) from children where parent_id=parents.id)
from parents WHERE name ilike '%something%';