One-to-many query while limiting based on distinct primary key
I have a table like this:
create table images (
    image_id serial primary key,
    user_id int references users(user_id),
    date_created timestamp with time zone
);
I then have a tag table for the tags that images can have:
create table images_tags (
    images_tag_id serial primary key,
    image_id int references images(image_id),
    tag_id int references tags(tag_id)
);
To get the results I want, I run a query like this:
select image_id, user_id, tag_id
from images left join images_tags using (image_id)
where (? = -1 or user_id = ?)
  and (? = -1 or tag_id in (?, ?, ?, ?)) -- have up to 4 tag_ids to search for
order by date_created desc
limit 100;
The problem is that I want to limit based on the number of unique image_id values, because my output will look like this:
{"images":[
{"image_id":1, "tag_ids":[1, 2, 3]},
....
]}
Notice how I group the tag_id values into an array for output, even though the SQL returns one row for each tag_id and image_id combination.
So, when I say limit 100, I want it to apply to 100 unique image_id values.
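To make the mismatch concrete, here is a minimal Python sketch of the client-side grouping described above. The function name group_rows and the sample rows are hypothetical, made up for illustration; they are not part of the original application.

```python
import json

def group_rows(rows):
    """Collapse (image_id, tag_id) rows into one entry per image, keeping row order."""
    images = []
    by_id = {}
    for image_id, tag_id in rows:
        if image_id not in by_id:
            by_id[image_id] = {"image_id": image_id, "tag_ids": []}
            images.append(by_id[image_id])
        if tag_id is not None:  # a LEFT JOIN yields NULL for an image without tags
            by_id[image_id]["tag_ids"].append(tag_id)
    return {"images": images}

# Hypothetical joined rows: image 1 has three tags, image 2 has one, image 3 has none.
sample_rows = [(1, 1), (1, 2), (1, 3), (2, 1), (3, None)]

# A row-level "limit 3" keeps three rows, but all of them belong to image 1.
limited = group_rows(sample_rows[:3])
print(json.dumps(limited))
```

With these sample rows, a limit of three rows produces a single image in the output, which is why the limit has to count images rather than joined rows.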
Maybe you should put one image on each row? If that works, you can do:
select image_id, user_id, string_agg(cast(tag_id as varchar(2000)), ',') as tags
from images left join
     images_tags
     using (image_id)
where (? = -1 or user_id = ?) and
      (? = -1 or tag_id in (?, ?, ?, ?)) -- have up to 4 tag_ids to search for
group by image_id, user_id, date_created -- date_created is grouped so the order by can use it
order by date_created desc
limit 100;
If that doesn't work, then use a CTE:
with cte as (
      select image_id, user_id, tag_id,
             -- image_id breaks ties, so images sharing a date_created get separate ranks
             dense_rank() over (order by date_created desc, image_id) as seqnum
      from images left join
           images_tags
           using (image_id)
      where (? = -1 or user_id = ?) and
            (? = -1 or tag_id in (?, ?, ?, ?)) -- have up to 4 tag_ids to search for
     )
select *
from cte
where seqnum <= 100
order by seqnum;
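To see how dense_rank() ends up counting images rather than joined rows, here is a small sketch that runs a reduced version of the CTE against an in-memory SQLite database through Python's sqlite3 module. It is an illustration under assumptions, not the PostgreSQL setup from the question: the sample data is made up, the user and tag filters are left out, dates are stored as text, and the bundled SQLite is assumed to be 3.25 or newer so that window functions are available. The window ordering includes image_id as a tie-breaker so that two images with the same date_created do not share a rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table images (image_id integer primary key, user_id int, date_created text);
    create table images_tags (images_tag_id integer primary key, image_id int, tag_id int);
    -- Images 2 and 3 deliberately share the same date_created.
    insert into images values (1, 10, '2020-01-03'), (2, 10, '2020-01-02'),
                              (3, 11, '2020-01-02'), (4, 11, '2020-01-01');
    insert into images_tags (image_id, tag_id) values (1, 1), (1, 2), (1, 3), (2, 1), (3, 2);
""")

ranked_query = """
    with cte as (
        select image_id, user_id, tag_id,
               dense_rank() over (order by date_created desc, image_id) as seqnum
        from images left join images_tags using (image_id)
    )
    select image_id, tag_id, seqnum
    from cte
    where seqnum <= ?
    order by seqnum, tag_id
"""

# Ask for the 2 newest images; every tag row of an image carries that image's rank.
ranked_rows = conn.execute(ranked_query, (2,)).fetchall()
print(ranked_rows)
```

All three tag rows of image 1 share rank 1, so a filter on seqnum keeps or drops an image as a whole. Without the tie-breaker, images 2 and 3 would both receive rank 2 and the filter could return more images than requested.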
Select 100 qualifying images first, and then join images_tags. Use an EXISTS semi-join to satisfy the condition on images_tags, and take care to get the parentheses right.
SELECT i.*, t.tag_id
FROM  (
   SELECT i.image_id, i.user_id
   FROM   images i
   WHERE  (? = -1 OR i.user_id = ?)
   AND    (? = -1 OR EXISTS (
      SELECT 1
      FROM   images_tags t
      WHERE  t.image_id = i.image_id
      AND    t.tag_id IN (?, ?, ?, ?)
      ))
   ORDER  BY i.date_created DESC
   LIMIT  100
   ) i
LEFT   JOIN images_tags t
          ON t.image_id = i.image_id
         AND (? = -1 OR t.tag_id IN (?, ?, ?, ?)); -- repeat condition
This should be faster than a solution with window functions and CTEs. Test performance with EXPLAIN ANALYZE. As always, run it a couple of times to warm up the cache.
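The select-first-then-join idea can be tried out on a toy dataset. The following is only a sketch under assumptions: it uses an in-memory SQLite database via Python's sqlite3 instead of PostgreSQL, the sample data is made up, the user and tag filters are omitted, and an explicit outer ORDER BY is added so that the grouped output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table images (image_id integer primary key, user_id int, date_created text);
    create table images_tags (images_tag_id integer primary key, image_id int, tag_id int);
    insert into images values (1, 10, '2020-01-03'), (2, 10, '2020-01-02'),
                              (3, 11, '2020-01-01'), (4, 11, '2019-12-31');
    -- Image 3 has no tags, so the LEFT JOIN returns a NULL tag_id for it.
    insert into images_tags (image_id, tag_id) values (1, 1), (1, 2), (1, 3), (2, 1), (4, 2);
""")

limit_then_join = """
    select i.image_id, t.tag_id
    from (
        select image_id, date_created
        from images
        order by date_created desc
        limit ?                      -- the limit applies to images, not to joined rows
    ) i
    left join images_tags t on t.image_id = i.image_id
    order by i.date_created desc, t.tag_id
"""

joined_rows = conn.execute(limit_then_join, (3,)).fetchall()

# Group the joined rows into the output shape from the question.
tags_by_image = {}
for image_id, tag_id in joined_rows:
    tag_ids = tags_by_image.setdefault(image_id, [])
    if tag_id is not None:
        tag_ids.append(tag_id)
output = {"images": [{"image_id": k, "tag_ids": v} for k, v in tags_by_image.items()]}
print(output)
```

Because the limit is applied inside the subquery, three images come back even though they span five joined rows, and the untagged image still appears with an empty tag list.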