简体   繁体   English

多对多查询

[英]many-to-many query

I have a problem and I dont know what is better solution. 我有一个问题,我不知道什么是更好的解决方案。 Okay, I have 2 tables: posts(id, title), posts_tags(post_id, tag_id). 好的,我有2个表:帖子(id,title),posts_tags(post_id,tag_id)。 I have next task: must select posts with tags ids for example 4, 10 and 11. Not exactly, post could have any other tags at the same time. 我有下一个任务:必须选择带有标签ID的帖子,例如4,10和11.不完全是,post可以同时包含任何其他标签。 So, how I could do it more optimized? 那么,我怎么能更优化呢? Creating temporary table in each query? 在每个查询中创建临时表? Or may be some kind of stored procedure? 或者可能是某种存储过程? In the future, user could ask script to select posts with any count of tags (it could be 1 tag only or 10 at the same time) and I must be sure that method that I will choose would be the best method for my problem. 将来,用户可以要求脚本选择任何标签数量的帖子(它可能只有1个标签或10个同时),我必须确保我选择的方法是解决我问题的最佳方法。 Sorry for my english, thx for attention. 对不起我的英文,请注意。

This solution assumes that (post_id, tag_id) in post_tags is enforced to be UNIQUE: 此解决方案假定post_tags中的(post_id,tag_id)强制为UNIQUE:

 SELECT id, title FROM posts
    INNER JOIN post_tag ON post_tag.post_id = posts.id
    WHERE tag_id IN (4, 6, 10)
    GROUP BY id, title
    HAVING COUNT(*) = 3

Although it's not a solution for all possible tag combinations, it's easy to create as dynamic SQL. 虽然它不是所有可能的标记组合的解决方案,但它很容易创建为动态SQL。 To change for other sets of tags, change the IN () list to have all the tags, and the COUNT(*) = to check for the number of tags specified. 要更改其他标记集,请更改IN()列表以包含所有标记,并使用COUNT(*)=检查指定的标记数。 The advantage of this solution over cascading a bunch of JOINs together is that you don't have to add JOINs, or even extra WHERE terms, when you change the request. 此解决方案优于将一堆JOIN级联在一起的优点是,当您更改请求时,您不必添加JOIN,甚至是额外的WHERE术语。

select id, title
from posts p, tags t
where p.id = t.post_id
and tag_id in ( 4,10,11 ) ;

?

Does this work? 这有用吗?

select *
from posts
where post.post_id in
    (select post_id
    from post_tags
    where tag_id = 4
    and post_id in (select post_id
                    from post_tags
                    where tag_id = 10
                    and post_id in (select post_id
                                    from post_tags
                                    where tag_id = 11)))

You can do a time-storage trade-off by storing a one-way hash of the post's tag names sorted alphabetically. 您可以通过存储按字母顺序排序的帖子标签名称的单向哈希来进行时间存储权衡。

When a post is tagged, execute select t.name from tags t inner join post_tags pt where pt.post_id = [ID_of_tagged_post] order by t.name . 当标记帖子时, select t.name from tags t inner join post_tags pt where pt.post_id = [ID_of_tagged_post] order by t.name执行select t.name from tags t inner join post_tags pt where pt.post_id = [ID_of_tagged_post] order by t.name Concatenate all of the tag names, create a hash using the MD5 algorithm and insert the value into a column alongside your post (or into another table joined by a foreign key, if you prefer). 连接所有标记名称,使用MD5算法创建哈希值,并将值插入到帖子旁边的列中(如果您愿意,还可以插入到由外键连接的另一个表中)。

When you want to search for a specific combination of tags, simply execute (remembering to sort the tag names) select from posts p where p.taghash = MD5([concatenated_tag_string]) . 当您想要搜索特定的标签组合时,只需执行(记住对标签名称进行排序) select from posts p where p.taghash = MD5([concatenated_tag_string])

This selects all posts that have any of the tags (4, 10, 11): 这将选择所有包含任何标签的帖子(4,10,11):

select distinct id, title from posts  
where exists ( 
  select * from posts_tags  
  where  
    post_id = id and 
    tag_id in (4, 10, 11)) 

Or you can use this: 或者您可以使用此:

select distinct id, title from posts   
join posts_tags on post_id = id 
where tag_id in (4, 10, 11) 

(Both will be optimized the same way). (两者都将以相同的方式进行优化)。

This selects all posts that have all of the tags (4, 10, 11): 这将选择包含所有标记的所有帖子(4,10,11):

select distinct id, title from posts
where not exists ( 
  select * from posts_tags t1 
  where 
    t1.tag_id in (4, 10, 11) and
    not exists (
      select * from posts_tags as t2
      where 
        t1.tag_id = t2.tag_id and
        id = t2.post_id))

The list of tags in the in clause is what dynamically changes (in all cases). in子句中的标记列表是动态更改的(在所有情况下)。

But, this last query is not really fast, so you could use something like this instead: 但是,这最后一个查询并不是很快,所以你可以使用这样的东西:

 create temporary table target_tags (tag_id int);
 insert into target_tags values(4),(10),(11);
 select id, title from posts 
   join posts_tags on post_id = id 
   join target_tags on target_tags.tag_id = posts_tags.tag_id
   group by id, title 
   having count(*) = (select count(*) from target_tags);
 drop table target_tags;

The part that changes dynamically is now in the second statement (the insert). 动态更改的部分现在位于第二个语句(插入)中。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM