I have three separate tables - pages, tags, and pages_tagged - that contain page content, tag names and IDs, and page IDs with tag IDs, respectively.
I'm trying to set up a MySQL query that takes the search term, checks for an existing tag, finds the matching tag ID, and returns all pages with that tag - I've got this working well. However, when I try to extend it to also match the search term against the title column of the pages table, things go a bit belly-up.
My SQL is as follows:
SELECT tags.id, pages_tagged.page_id, pages.id, pages.randomId, pages.title,
DATE_FORMAT( pages.dateAdded, '%M %e, %Y' ) AS dateAdded,
pages.viewcount, pages.sessionId
FROM tags JOIN pages_tagged ON tags.id = pages_tagged.tag_id
JOIN pages ON pages_tagged.page_id = pages.randomId
WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')
ORDER BY dateAdded DESC
I know that the order of operations here is very wrong, but I can't wrap my head around the correct way to modify this query to make it work correctly.
Would anyone be able to point out my glaring errors?
Edit:
To clarify "belly-up," when the query is run, it's "successful." However, no rows are ever returned.
Modifying the WHERE clause as follows to isolate the pages.title LIKE '%ovechkin%' never results in returned rows, no matter what the search term is.
WHERE (pages.title LIKE '%ovechkin%')
Edit 2:
Sample data below.
pages
╔════╦════════════════════════╦═════════════════════╦══════════╦═══════════╗
║ id ║ title ║ dateAdded ║ randomId ║ viewcount ║
╠════╬════════════════════════╬═════════════════════╬══════════╬═══════════╣
║ 57 ║ Ovechkin looping about ║ 2013-04-07 19:26:06 ║ xp3rvju ║ 5 ║
╚════╩════════════════════════╩═════════════════════╩══════════╩═══════════╝
tags
╔════════╦══════════╗
║ id ║ tag ║
╠════════╬══════════╣
║ 25 ║ ovechkin ║
╚════════╩══════════╝
pages_tagged
╔════════╦═════════╗
║ tag_id ║ page_id ║
╠════════╬═════════╣
║ 25     ║ xp3rvju ║
║ 25     ║ mpbjbk6 ║
╚════════╩═════════╝
Edit 3:
As suggested, a RIGHT JOIN gets the pages.title match working. The modified query is:
SELECT tags.id, pages_tagged.page_id, pages.id, pages.randomId, pages.title,
DATE_FORMAT( pages.dateAdded, '%M %e, %Y' ) AS dateAdded,
pages.viewcount, pages.sessionId
FROM tags RIGHT JOIN pages_tagged ON tags.id = pages_tagged.tag_id
RIGHT JOIN pages ON pages_tagged.page_id = pages.randomId
WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')
ORDER BY dateAdded DESC
A remaining concern is that if a page has the same search term in both its title and an associated tag, it's returned twice. I've tried adding DISTINCT to the SELECT, as follows, but this has no effect on the returned rows.
SELECT DISTINCT tags.id, pages_tagged.page_id, pages.id, pages.randomId, pages.title,
DATE_FORMAT( pages.dateAdded, '%M %e, %Y' ) AS dateAdded,
Edit 4:
May as well include the final solution to prevent duplicates - GROUP BY.
SELECT tags.id, pages_tagged.page_id, pages.id, pages.randomId, pages.title,
DATE_FORMAT( pages.dateAdded, '%M %e, %Y' ) AS dateAdded,
pages.viewcount, pages.sessionId
FROM pages
LEFT JOIN pages_tagged ON pages.randomId = pages_tagged.page_id
LEFT JOIN tags ON tags.id = pages_tagged.tag_id
WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')
GROUP BY pages.randomId
ORDER BY dateAdded DESC
Try the following:
select * from
(SELECT tags.id as tid, pages_tagged.page_id, pages.id, pages.randomId, pages.title,
DATE_FORMAT( pages.dateAdded, '%M %e, %Y' ) AS dateAdded,
pages.viewcount, pages.sessionId
FROM tags JOIN pages_tagged ON tags.id = pages_tagged.tag_id
JOIN pages ON pages_tagged.page_id = pages.randomId
WHERE tags.tag = 'thang'
union
( SELECT tags.id as tid, pages_tagged.page_id, pages.id, pages.randomId, pages.title,
DATE_FORMAT( pages.dateAdded, '%M %e, %Y' ) AS dateAdded,
pages.viewcount, pages.sessionId
FROM pages JOIN pages_tagged on pages_tagged.page_id = pages.randomId
JOIN tags ON tags.id = pages_tagged.tag_id
WHERE pages.title LIKE '%thang%'
)
) as a
ORDER BY a.dateAdded DESC
That way it first selects the pages with a matching tag, then the pages with a matching title.
It is important to understand how joins work in general; here is a simple explanation: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
Since you are looking for a tag OR a match in the title, I'm guessing you want to use a RIGHT JOIN for the relation, because this query will not return any pages that aren't tagged.
Currently, if you have a page with the title ovechkin but no tags, you won't find it using this query.
This is what I tried: http://sqlfiddle.com/#!2/c25c5/2
Generally, the way the query is built means that you are getting all tags, then joining any tagged pages. The behaviour without the WHERE clause is as follows:
Doing a normal JOIN will return only tagged pages; if there are no tags in the database, you won't get a single row.
Using a LEFT JOIN means you get a result for each tag, even if no pages are tagged.
Using a RIGHT JOIN means you will get a result row for every page, even if there are no tags or no pages are tagged.
For all of these, any fields where there is no data will be filled with NULL.
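The difference is easy to check on a small data set. Below is a minimal sketch that uses Python's built-in sqlite3 module instead of MySQL (an assumption, made only so it runs without a server). The tables follow the sample data in the question, trimmed to the relevant columns; the second page, 'Untagged Ovechkin clip' with randomId 'zz00000', is made up for illustration. RIGHT JOIN is avoided because older SQLite versions lack it; a LEFT JOIN starting from pages keeps every page in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, randomId TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE pages_tagged (tag_id INTEGER, page_id TEXT);

    INSERT INTO pages VALUES (57, 'Ovechkin looping about', 'xp3rvju');
    -- Hypothetical page with no tags, not part of the question's data.
    INSERT INTO pages VALUES (58, 'Untagged Ovechkin clip', 'zz00000');
    INSERT INTO tags VALUES (25, 'ovechkin');
    INSERT INTO pages_tagged VALUES (25, 'xp3rvju');
    INSERT INTO pages_tagged VALUES (25, 'mpbjbk6');
""")

WHERE = "WHERE tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%'"

# Inner joins starting from tags: only pages that have a tag row survive.
inner_rows = conn.execute(f"""
    SELECT pages.randomId FROM tags
    JOIN pages_tagged ON tags.id = pages_tagged.tag_id
    JOIN pages ON pages_tagged.page_id = pages.randomId
    {WHERE} ORDER BY pages.id
""").fetchall()

# Left joins starting from pages: every page is kept, tagged or not.
left_rows = conn.execute(f"""
    SELECT pages.randomId FROM pages
    LEFT JOIN pages_tagged ON pages.randomId = pages_tagged.page_id
    LEFT JOIN tags ON tags.id = pages_tagged.tag_id
    {WHERE} ORDER BY pages.id
""").fetchall()

print(inner_rows)  # the untagged page is missing
print(left_rows)   # the untagged page is found through its title
```

With the inner joins the untagged page never reaches the WHERE clause, so its title cannot match; with the left joins it is kept with NULL tag columns and the title condition picks it up.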
I would recommend changing the query to this (certain fields removed for readability):
SELECT tags.id, pages_tagged.page_id, pages.id, pages.randomId
FROM pages
LEFT JOIN pages_tagged ON pages.randomId = pages_tagged.page_id
LEFT JOIN tags ON tags.id = pages_tagged.tag_id
WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')
If a page's title matches and the page has more than one tag, you will get that page once per tag.
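The duplication, and one possible way around it, can be reproduced with the sketch below. It again assumes Python's sqlite3 module in place of MySQL, and the second tag 'capitals' (id 26) is hypothetical. It also suggests why DISTINCT had no effect in the question: as long as tags.id is in the select list, the rows for one page differ by tag and are not duplicates. Selecting only columns from pages lets DISTINCT collapse them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, randomId TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE pages_tagged (tag_id INTEGER, page_id TEXT);

    INSERT INTO pages VALUES (57, 'Ovechkin looping about', 'xp3rvju');
    INSERT INTO tags VALUES (25, 'ovechkin');
    -- Hypothetical second tag on the same page.
    INSERT INTO tags VALUES (26, 'capitals');
    INSERT INTO pages_tagged VALUES (25, 'xp3rvju');
    INSERT INTO pages_tagged VALUES (26, 'xp3rvju');
""")

FROM_WHERE = """
    FROM pages
    LEFT JOIN pages_tagged ON pages.randomId = pages_tagged.page_id
    LEFT JOIN tags ON tags.id = pages_tagged.tag_id
    WHERE tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%'
"""

# tags.id differs between the two rows, so DISTINCT keeps both.
with_tag_cols = conn.execute(
    "SELECT DISTINCT tags.id, pages.randomId" + FROM_WHERE + "ORDER BY tags.id"
).fetchall()

# Only page columns are selected, so the rows are identical and collapse.
pages_only = conn.execute(
    "SELECT DISTINCT pages.id, pages.randomId, pages.title" + FROM_WHERE
).fetchall()

print(with_tag_cols)
print(pages_only)
```

If the tag columns are needed in the output, grouping by the page and aggregating the tags is another option, but that is a design choice beyond this sketch.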