简体   繁体   中英

How can I query a many-to-many join with filtering on the same relation?

I simplified a many-to-many relationship case with these mockup tables.

Posts:
------------------------------
|   id |    title |     body |
------------------------------
|    1 |      One |    text1 |
|    2 |      Two |    text2 |
|    3 |    Three |    text3 |
------------------------------

Tags:
-------------------
|   id |     name |
-------------------
|    1 |      SQL |
|    2 |     GLSL |
|    3 |      PHP |
-------------------

Post_tags:
------------------------------
|   id |    p_id |      t_id |
------------------------------
|    1 |       1 |         1 |
|    2 |       1 |         3 |
|    3 |       2 |         1 |
|    3 |       3 |         2 |
------------------------------

My goal is to query POSTS with specific TAGS, which I have no problem with, but I also want to display all related tags to the post not just the one I queried for. My query looks like this:

SELECT p.Title, p.Body, t.name
FROM Posts p
LEFT JOIN Post_tags pt ON p.id = pt.p_id
LEFT JOIN Tags t ON t.id = pt.t_id
WHERE t.name LIKE '%SQL%'

This gets the posts with the "SQL" tag, but it only joins the posts table with tags where it found the "SQL" string, so the other tag "PHP" associated with the post doesn't get joined.

Obviously the problem is I'm joining the table on the WHERE clause, but how would one solve this in one query or (preferably with subqueries)?

Currently I'm doing this in two separate queries in my application, one for selecting matching posts and another one that is retreiving full post data. This isn't so efficient and also seems like a lame solution, and I haven't found a better yet, so I decided to ask the StackOverflow community.

The most concise (might be fast) I can think of:

select p.*, '' as x, t.name
from Posts p
join Posts_tags pt 
ON  pt.p_id = p.id 
AND pt.p_id in (select p_id 
                from Posts_tags 
                join Tags on Tags.id = Posts_tags.t_id 
                where Tags.name like '%SQL%')
join Tags t on t.id = pt.t_id;

If you need the tags collapsed in one line, use GROUP_CONCAT:

select p.*, group_concat(t.name) as tags
from Posts p
join Posts_tags pt 
ON  pt.p_id = p.id 
AND pt.p_id in (select p_id 
                from Posts_tags 
                join Tags on Tags.id = Posts_tags.t_id 
                where Tags.name like '%SQL%')
join Tags t on t.id = pt.t_id
group by p.id;

Output:

ID  TITLE   BODY    TAGS
1   One     text1   SQL,PHP
2   Two     text2   SQL

Live test: http://www.sqlfiddle.com/#!2/52b3b/2


UPDATE

There's a solution more optimized than this, see here: https://stackoverflow.com/a/10471529

My old answer is not the shortest, here's the shortest one:

select p.*, '' as x, t.name, t.name like '%SQL%'
from Posts p
join Posts_tags pt on pt.p_id = p.id 
join Tags t on t.id = pt.t_id;

Output:

ID  TITLE   BODY    X       NAME    T.NAME LIKE '%SQL%'
1   One     text1           SQL     1
1   One     text1           PHP     0
2   Two     text2           SQL     1
3   Three   text3           GLSL    0

So if we group by ID, and check that if at least one (aided by bit_or; Postgresql has this too, aptly named bool_or) of the elements in the group satisfied the '%SQL%' criteria, its bit is ON (aka boolean = true). We can pick that group and we retain all the tags under that group, example, tag id 1 appear on post 1, and post 1 has other tags, which is #3 or PHP. All tags that belong to same Post ID will not be discarded, as we will not be using WHERE filter, we will be using HAVING filter instead:

select p.*, group_concat(t.name) as tags
from Posts p
join Posts_tags pt on pt.p_id = p.id 
join Tags t on t.id = pt.t_id
group by p.id
having bit_or(t.name like '%SQL%');

We can also rewrite that to this:

select p.*, group_concat(t.name) as tags
from Posts p
join Posts_tags pt on pt.p_id = p.id 
join Tags t on t.id = pt.t_id
group by p.id
having sum(t.name like '%SQL%') >= 1;

BIT_OR is like IN , or ANY , so it's more semantic than evaluating things by SUM

Output:

D   TITLE   BODY    TAGS
1   One     text1   PHP,SQL
2   Two     text2   SQL

Live test: http://www.sqlfiddle.com/#!2/52b3b/26


I'm learning so much on stackoverflow. After my old answer, I'm thinking how to make an equivalent shorter code in Postgresql using windowing function(which MySQL don't have) via SUM OVER partition . Then I thought of Postgresql's bool_or , bool_and and every function. Then I remember MySQL has bit_or :-)

The last solution using SUM is just an afterthought, when I thought up that bit_or is just a semantic of at least one is true , then it's obvious that you can use HAVING SUM(condition) >= 1 too. Now it works on all database :-)

I ended up not solving it by windowing function, the solution above now works on all database :-)

Put on a separate inner join for all tags

SELECT p.Title, p.Body, t2.name
FROM Posts p
LEFT JOIN Post_tags pt ON p.id = pt.p_id
LEFT JOIN Tags t ON t.id = pt.t_id
INNER JOIN Post_tags pt2 ON p.id = pt2.p_id
INNER JOIN Tags t2 on ON t2.id = pt2.t_id
WHERE t.name LIKE '%SQL%'

Try this:

SELECT p.Title, p.Body, t.name,GROUP_CONCAT(t2.name) AS `tags`
FROM Posts p
LEFT JOIN Post_tags pt ON p.id = pt.p_id
LEFT JOIN Tags t ON t.id = pt.t_id
JOIN Tags t2 ON t2.id = p.id
WHERE t.name LIKE '%SQL%'

This uses GROUP_CONCAT to create a comma-separated list of tags associated with that particular post. Output for your query:

TITLE BODY   NAME   tags
One   text1  SQL    SQL,GLSL

SQL fiddle: http://sqlfiddle.com/#!2/2f698/9

Yet another way to do this is built around an inner join of posts_tags with itself:

SELECT *
FROM posts_tags pt1
JOIN posts_tags pt2
USING(p_id)
WHERE pt2.t_id = 1;

+------+------+------+
| p_id | t_id | t_id |
+------+------+------+
|    1 |    1 |    1 |
|    1 |    3 |    1 |
|    1 |    4 |    1 |
|    3 |    1 |    1 |
|    3 |    2 |    1 |
|    5 |    1 |    1 |
|    5 |    3 |    1 |
|    7 |    1 |    1 |
+------+------+------+
8 rows in set (0.00 sec)

Without the WHERE clause the inner join would give the full cartesian product (t_id 1, t_id 2) of all tags associated with each post. Applying the WHERE clause to half the cartesian product gives you the "all members of sets containing x" structure you're looking for. (The example above demonstrates that only posts associated with tag id 1 have been retrieved; further, all tags associated with those posts are present as well.) Now it's two more simple joins to fetch the information associated with p_id and t_id:

SELECT title,name
FROM posts_tags pt1
JOIN posts_tags pt2
  ON(pt1.p_id = pt2.p_id)
JOIN posts
  ON(pt1.p_id = posts.id)
JOIN tags
  ON (pt1.t_id = tags.id)
WHERE pt2.t_id = 1;

+---------+--------+
| title   | name   |
+---------+--------+
| first   | php    |
| first   | skiing |
| first   | tuna   |
| third   | php    |
| third   | sql    |
| fifth   | php    |
| fifth   | skiing |
| seventh | php    |
+---------+--------+
8 rows in set (0.01 sec)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM