
Stumbling with join and selecting from separate table

I have three separate tables - pages, tags, and pages_tagged - that contain page content, tag names and IDs, and page IDs paired with tag IDs, respectively.

I'm trying to set up a MySQL query that takes the search term, checks for an existing tag, finds the matching tag ID, and returns all pages with that tag - I've got this working well. However, when I try to extend it to also match the search term against the title column of the pages table, things go a bit belly-up.

My SQL is as follows:

SELECT tags.id, pages_tagged.page_id, pages.id, pages.randomId, pages.title, 
       DATE_FORMAT( pages.dateAdded,  '%M %e, %Y' ) AS dateAdded, 
       pages.viewcount, pages.sessionId 
FROM tags JOIN pages_tagged ON tags.id = pages_tagged.tag_id 
JOIN pages ON pages_tagged.page_id = pages.randomId 
WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')
ORDER BY dateAdded DESC

I know that the order of operations here is very wrong, but I can't wrap my head around the correct way to modify this query to make it work correctly.

Would anyone be able to point out my glaring errors?

Edit:

To clarify "belly-up," when the query is run, it's "successful." However, no rows are ever returned.

Modifying the WHERE clause as follows to isolate the pages.title LIKE '%ovechkin%' never results in returned rows, no matter what the search term is.

WHERE (pages.title LIKE '%ovechkin%')

Edit 2:

Sample data below.

pages
╔════╦════════════════════════╦═════════════════════╦══════════╦═══════════╗
║ id ║         title          ║      dateAdded      ║ randomId ║ viewcount ║
╠════╬════════════════════════╬═════════════════════╬══════════╬═══════════╣
║ 57 ║ Ovechkin looping about ║ 2013-04-07 19:26:06 ║ xp3rvju  ║         5 ║
╚════╩════════════════════════╩═════════════════════╩══════════╩═══════════╝

tags
╔════════╦══════════╗
║ id     ║ tag      ║
╠════════╬══════════╣
║     25 ║ ovechkin ║
╚════════╩══════════╝

pages_tagged
╔════════╦═════════╗
║ tag_id ║ page_id ║
╠════════╬═════════╣
║     25 ║ xp3rvju ║
║     25 ║ mpbjbk6 ║
╚════════╩═════════╝

Edit 3:

As suggested, a RIGHT JOIN gets the pages.title working. The modified query is:

SELECT tags.id, pages_tagged.page_id, pages.id, pages.randomId, pages.title, 
   DATE_FORMAT( pages.dateAdded,  '%M %e, %Y' ) AS dateAdded, 
   pages.viewcount, pages.sessionId 
FROM tags RIGHT JOIN pages_tagged ON tags.id = pages_tagged.tag_id 
RIGHT JOIN pages ON pages_tagged.page_id = pages.randomId 
WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')
ORDER BY dateAdded DESC    

A remaining concern is that if a page has the search term in both its title and an associated tag, it is returned twice. I tried adding DISTINCT to the SELECT, as follows, but this has no effect on the returned rows.

SELECT DISTINCT tags.id, pages_tagged.page_id, pages.id, pages.randomId, pages.title, 
   DATE_FORMAT( pages.dateAdded,  '%M %e, %Y' ) AS dateAdded, 

Edit 4:

May as well include the final solution to prevent duplicates - GROUP BY.

SELECT tags.id, pages_tagged.page_id, pages.id, pages.randomId, pages.title, 
   DATE_FORMAT( pages.dateAdded,  '%M %e, %Y' ) AS dateAdded, 
   pages.viewcount, pages.sessionId 
FROM pages 
LEFT JOIN pages_tagged ON pages.randomId = pages_tagged.page_id 
LEFT JOIN tags ON tags.id = pages_tagged.tag_id 
WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')
GROUP BY pages.randomId
ORDER BY dateAdded DESC    

Try the following:

SELECT * FROM
  (SELECT tags.id AS tid, pages_tagged.page_id, pages.id, pages.randomId, pages.title,
          DATE_FORMAT( pages.dateAdded,  '%M %e, %Y' ) AS dateAdded,
          pages.viewcount, pages.sessionId
   FROM tags JOIN pages_tagged ON tags.id = pages_tagged.tag_id
   JOIN pages ON pages_tagged.page_id = pages.randomId
   WHERE tags.tag = 'thang'
   UNION
   SELECT tags.id AS tid, pages_tagged.page_id, pages.id, pages.randomId, pages.title,
          DATE_FORMAT( pages.dateAdded,  '%M %e, %Y' ) AS dateAdded,
          pages.viewcount, pages.sessionId
   FROM pages JOIN pages_tagged ON pages_tagged.page_id = pages.randomId
   JOIN tags ON tags.id = pages_tagged.tag_id
   WHERE pages.title LIKE '%thang%'
  ) AS a
ORDER BY a.dateAdded DESC

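That way the first SELECT picks the pages with a matching tag and the second picks the pages with a matching title. UNION then merges the two result sets and drops rows that are identical in both.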
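To see what the UNION does to a page that matches on both tag and title, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here (an assumption for the sake of a runnable example), the columns are trimmed for brevity, and the search term is 'ovechkin' from the question's sample data.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER, title TEXT, randomId TEXT);
    CREATE TABLE tags (id INTEGER, tag TEXT);
    CREATE TABLE pages_tagged (tag_id INTEGER, page_id TEXT);
    INSERT INTO pages VALUES (57, 'Ovechkin looping about', 'xp3rvju');
    INSERT INTO tags VALUES (25, 'ovechkin');
    INSERT INTO pages_tagged VALUES (25, 'xp3rvju');
""")

# Branch 1: pages found through a matching tag.
BY_TAG = """
    SELECT tags.id AS tid, pages.randomId, pages.title
    FROM tags
    JOIN pages_tagged ON tags.id = pages_tagged.tag_id
    JOIN pages ON pages_tagged.page_id = pages.randomId
    WHERE tags.tag = 'ovechkin'
"""

# Branch 2: pages found through a matching title.
BY_TITLE = """
    SELECT tags.id AS tid, pages.randomId, pages.title
    FROM pages
    JOIN pages_tagged ON pages_tagged.page_id = pages.randomId
    JOIN tags ON tags.id = pages_tagged.tag_id
    WHERE pages.title LIKE '%ovechkin%'
"""

# UNION drops rows that are identical in both branches; UNION ALL keeps them.
union_rows = conn.execute(f"{BY_TAG} UNION {BY_TITLE}").fetchall()
union_all_rows = conn.execute(f"{BY_TAG} UNION ALL {BY_TITLE}").fetchall()

print(union_rows)
print(len(union_all_rows))
```

Note that both branches still use inner joins, so a page whose title matches but which has no tags at all would be missing from either branch.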

It is important to understand how joins work in general; here is a simple explanation: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html

Since you are looking for a tag OR a match in the title, I'm guessing you want to be using a RIGHT JOIN for the relation, because the current query will not return any pages that aren't tagged.

Currently, if you have a page with the title ovechkin but no tags, you won't find it using this query.

This is what I tried: http://sqlfiddle.com/#!2/c25c5/2

Generally, the way the query is built means that you are getting all tags, then joining any tagged pages. The behaviour without the WHERE clause is as follows:

Doing a normal JOIN will return only tagged pages; if there are no tags in the database, you won't get a single row.

Using a LEFT JOIN means you get a result for each tag, even if no pages are tagged.

Using a RIGHT JOIN means you get a result row for every page, even if there are no tags or no pages are tagged.

For all of these, any fields where there is no data will be filled with null.
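The difference between the inner join and the outer join can be checked with a small sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here (an assumption), and because older SQLite versions lack RIGHT JOIN, the outer join is written as a LEFT JOIN starting from pages, which keeps every page in the same way. The second page (randomId 'untagd1') is hypothetical: it has a matching title but no tags.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER, title TEXT, randomId TEXT);
    CREATE TABLE tags (id INTEGER, tag TEXT);
    CREATE TABLE pages_tagged (tag_id INTEGER, page_id TEXT);

    INSERT INTO pages VALUES (57, 'Ovechkin looping about', 'xp3rvju');
    -- Hypothetical untagged page whose title matches the search term.
    INSERT INTO pages VALUES (58, 'Ovechkin scores again', 'untagd1');
    INSERT INTO tags VALUES (25, 'ovechkin');
    INSERT INTO pages_tagged VALUES (25, 'xp3rvju');
    INSERT INTO pages_tagged VALUES (25, 'mpbjbk6');
""")

WHERE = "WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')"

# Inner joins starting from tags: only pages that have a tag row survive.
inner_rows = conn.execute(f"""
    SELECT pages.randomId FROM tags
    JOIN pages_tagged ON tags.id = pages_tagged.tag_id
    JOIN pages ON pages_tagged.page_id = pages.randomId
    {WHERE} ORDER BY pages.randomId
""").fetchall()

# Outer joins starting from pages: every page survives, with NULL tag columns
# when it has no tags, so the title condition can still match.
left_rows = conn.execute(f"""
    SELECT pages.randomId FROM pages
    LEFT JOIN pages_tagged ON pages.randomId = pages_tagged.page_id
    LEFT JOIN tags ON tags.id = pages_tagged.tag_id
    {WHERE} ORDER BY pages.randomId
""").fetchall()

print(inner_rows)  # the untagged page is missing
print(left_rows)   # the untagged page is included
```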

I would recommend changing the query to this (certain fields removed for readability):

SELECT tags.id, pages_tagged.page_id, pages.id, pages.randomId
FROM pages 
LEFT JOIN pages_tagged ON pages.randomId = pages_tagged.page_id 
LEFT JOIN tags ON tags.id = pages_tagged.tag_id
WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')

You will get the same page more than once if it has more than one tag.
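A small sketch of where the duplicates come from and two ways to collapse them, again using Python's built-in sqlite3 module with SQLite standing in for MySQL (an assumption). The second tag, 'capitals', is hypothetical and only there to give the page two tags. The likely reason DISTINCT appears to do nothing is that tags.id is in the select list, so the two rows are not actually identical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER, title TEXT, randomId TEXT);
    CREATE TABLE tags (id INTEGER, tag TEXT);
    CREATE TABLE pages_tagged (tag_id INTEGER, page_id TEXT);
    INSERT INTO pages VALUES (57, 'Ovechkin looping about', 'xp3rvju');
    INSERT INTO tags VALUES (25, 'ovechkin');
    INSERT INTO tags VALUES (26, 'capitals');  -- hypothetical second tag
    INSERT INTO pages_tagged VALUES (25, 'xp3rvju');
    INSERT INTO pages_tagged VALUES (26, 'xp3rvju');
""")

FROM_WHERE = """
    FROM pages
    LEFT JOIN pages_tagged ON pages.randomId = pages_tagged.page_id
    LEFT JOIN tags ON tags.id = pages_tagged.tag_id
    WHERE (tags.tag = 'ovechkin' OR pages.title LIKE '%ovechkin%')
"""

def run(select_list, suffix=""):
    """Run the shared FROM/WHERE with a given select list and optional suffix."""
    return conn.execute(f"SELECT {select_list} {FROM_WHERE} {suffix}").fetchall()

# DISTINCT with tags.id selected: rows differ by tag id, so both remain.
with_tag_distinct = run("DISTINCT tags.id, pages.randomId", "ORDER BY tags.id")

# DISTINCT over page columns only: the two rows are identical and collapse.
pages_only_distinct = run("DISTINCT pages.id, pages.randomId, pages.title")

# GROUP BY over the page columns gives the same single row.
grouped = run("pages.id, pages.randomId, pages.title",
              "GROUP BY pages.id, pages.randomId, pages.title")

print(with_tag_distinct)
print(pages_only_distinct)
print(grouped)
```

Grouping by every selected page column, rather than by pages.randomId alone, avoids relying on MySQL picking an arbitrary value for the non-grouped columns, which stricter SQL modes reject.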
