
SQL query LEFT OUTER JOIN ads with AVG(rating)

I have a DB with the following tables:

ads
ads_rating
ads_province
ads_promo
etc...

I'm building a complex query because I need certain ads together with the average rating of each one.

The ads_rating table has the columns id, ad_id, user_id, rating, date_rated.

How do I JOIN it to the ads table, keeping ads.*, and add a new field called "rating_average" or something like that?

I suppose I must put a SELECT inside a JOIN, but I'm a newbie at MySQL.

This is my current working query:

SELECT
category.id AS category_id,
category.subcat AS category_name,
category.`desc` AS category_desc,
category.`name` AS category_pretty_name,
ads.id,
ads.header,
ads.price,
ads.oldprice,
ads.sellfast,
ads.`hash`,
ads.foto1,
ads.foto2,
ads.foto3,
ads.foto4,
ads.foto5,
ads.user_id,
SUBSTR(ads.body, 1, 160) AS body,
ads.subcat_id,
ads.updated,
ads.created,
ads.email,
ads.`name`,
ads.phone,
ads.hits,
ads.hidden,
promo.promotype AS promo_type,
supercategory.`name` AS supercategory_name,
supercategory.id AS supercategory_id,
ads_rating.rating,
promo.ads_id
FROM `ads` 
JOIN `category` ON `category`.`id` = `ads`.`subcat_id` 
JOIN `supercategory` ON `supercategory`.`id` = `category`.`cat`  
LEFT OUTER JOIN `promo` ON `promo`.`ads_id` = `ads`.`id` 
LEFT OUTER JOIN ads_rating ON `ads_rating`.`ad_id` = `ads`.`id`
WHERE `recycle_bin` != 1 AND `hidden` =0 AND ( `promo`.`promotype` >0 OR `ads`.`user_id` = 20 OR `ads_rating`.`rating` >= 4 )
ORDER BY `promo_type` DESC, `updated` DESC
LIMIT 5000

I just don't know how to handle this line:

LEFT OUTER JOIN ads_rating ON `ads_rating`.`ad_id` = `ads`.`id`

Sample Data:

ads
id|header|body|category|etc...
2|Pretty pupy|It's new|puppys|etc
3|Ugly pupy|It's old|puppys|etc

rating
id|ad_id|user_id|rating|rated_date
1|2|568|5|2017-10-2
1|2|570|4|2017-10-3
1|2|594|5|2017-10-1

So the desired result set has to be:

id|header|body|category|avg_rating
2|Pretty pupy|It's new|puppys|4.67
3|Ugly pupy|It's old|puppys|null

Thanks!

The LEFT [OUTER] JOIN syntax you have used appears to be fine, but we don't have access to any sample data to verify that. However, there is an aspect of your query that may be confusing: the extra condition ads_rating.rating >= 4 that you introduce in the where clause.

I think you may find it easier to include that extra condition in the join rather than in the where clause, like this:

select ...

FROM `ads` 
JOIN `category` ON `category`.`id` = `ads`.`subcat_id` 
JOIN `supercategory` ON `supercategory`.`id` = `category`.`cat`  
LEFT OUTER JOIN `promo` ON `promo`.`ads_id` = `ads`.`id` 
LEFT OUTER JOIN ads_rating ON `ads_rating`.`ad_id` = `ads`.`id`
                          AND `ads_rating`.`rating` >= 4
WHERE `recycle_bin` != 1 
AND `hidden` =0 
AND ( `promo`.`promotype` >0 
   OR `ads`.`user_id` = 20
    )
ORDER BY `promo_type` DESC, `updated` DESC
LIMIT 5000

The reason for this is that a LEFT JOIN allows a row from the ads table to be returned even if there is no corresponding ads_rating row, in which case every column from ads_rating is NULL. For example:

 ads.id ads_rating.rating
 1      4
 2      NULL

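If your where clause requires ads_rating.rating >= 4 (that is, the condition is combined with AND), then id 2 would be excluded from the final result, so that predicate effectively turns the LEFT JOIN into the equivalent of an INNER JOIN. In your query the condition sits inside an OR group, so a row that satisfies one of the other conditions still passes, but the effect is worth knowing about.

So, when using any OUTER join (e.g. LEFT OUTER JOIN), be very wary of referring to the outer-joined tables in the where clause. Often it is simpler to put those extra conditions into the join instead.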
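To make the difference concrete, here is a minimal sketch that runs both variants against the two-row example above. It uses Python's built-in sqlite3 module as a stand-in for MySQL purely so it can be run without a server; the cut-down tables and the header values are assumptions for illustration, not your real schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads (id INTEGER PRIMARY KEY, header TEXT);
    CREATE TABLE ads_rating (id INTEGER PRIMARY KEY, ad_id INTEGER, rating INTEGER);
    INSERT INTO ads VALUES (1, 'Rated ad'), (2, 'Unrated ad');
    INSERT INTO ads_rating (ad_id, rating) VALUES (1, 4);
""")

# Condition in WHERE: the NULL rating of ad 2 fails the test, so the row is dropped.
in_where = conn.execute("""
    SELECT ads.id, ads_rating.rating
    FROM ads
    LEFT OUTER JOIN ads_rating ON ads_rating.ad_id = ads.id
    WHERE ads_rating.rating >= 4
    ORDER BY ads.id
""").fetchall()

# Condition in the join: ad 2 is kept, with NULL for the rating column.
in_join = conn.execute("""
    SELECT ads.id, ads_rating.rating
    FROM ads
    LEFT OUTER JOIN ads_rating ON ads_rating.ad_id = ads.id
                              AND ads_rating.rating >= 4
    ORDER BY ads.id
""").fetchall()

print(in_where)  # [(1, 4)]
print(in_join)   # [(1, 4), (2, None)]
```

The first result behaves like an INNER JOIN; the second keeps every ad, which is usually what a LEFT JOIN is meant to do.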

If you join ads directly to ads_rating you will get one row for every rating that exists for a given ad. If you want just the average rating rather than every single rating, change ads_rating.rating to AVG(ads_rating.rating) and group by all the other selected columns. This gives you the average rating per ad.

SELECT
category.id AS category_id,
category.subcat AS category_name,
category.desc AS category_desc,
category.name AS category_pretty_name,
ads.id,
ads.header,
ads.price,
ads.oldprice,
ads.sellfast,
ads.hash,
ads.foto1,
ads.foto2,
ads.foto3,
ads.foto4,
ads.foto5,
ads.user_id,
SUBSTR(ads.body, 1, 160) AS body,
ads.subcat_id,
ads.updated,
ads.created,
ads.email,
ads.name,
ads.phone,
ads.hits,
ads.hidden,
promo.promotype AS promo_type,
supercategory.name AS supercategory_name,
supercategory.id AS supercategory_id,
AVG(ads_rating.rating) AS rating_average,
promo.ads_id
FROM ads 
inner join category ON category.id = ads.subcat_id 
inner join supercategory ON supercategory.id = category.cat  
LEFT OUTER JOIN promo ON promo.ads_id = ads.id 
LEFT OUTER JOIN ads_rating ON ads_rating.ad_id = ads.id
-- Row-level filters only; a rating filter here would drop rows before AVG is computed
WHERE recycle_bin != 1 AND hidden = 0
-- GROUP BY takes plain expressions, not "AS" aliases
GROUP BY
category.id,
category.subcat,
category.desc,
category.name,
ads.id,
ads.header,
ads.price,
ads.oldprice,
ads.sellfast,
ads.hash,
ads.foto1,
ads.foto2,
ads.foto3,
ads.foto4,
ads.foto5,
ads.user_id,
SUBSTR(ads.body, 1, 160),
ads.subcat_id,
ads.updated,
ads.created,
ads.email,
ads.name,
ads.phone,
ads.hits,
ads.hidden,
promo.promotype,
promo.ads_id,
supercategory.name,
supercategory.id
-- Conditions on the aggregate belong in HAVING, which runs after grouping
HAVING promo.promotype > 0 OR ads.user_id = 20 OR AVG(ads_rating.rating) >= 4
ORDER BY promo_type DESC, updated DESC
LIMIT 5000
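As a quick check of the AVG plus GROUP BY idea, here is a small sketch run against the sample data from the question. It again uses Python's sqlite3 module in place of MySQL so it is self-contained; the tables are trimmed to a few columns, and ROUND(..., 2) is added only to make the output readable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads (id INTEGER PRIMARY KEY, header TEXT);
    CREATE TABLE ads_rating (
        id INTEGER PRIMARY KEY, ad_id INTEGER, user_id INTEGER, rating INTEGER
    );
    INSERT INTO ads VALUES (2, 'Pretty pupy'), (3, 'Ugly pupy');
    INSERT INTO ads_rating (ad_id, user_id, rating)
    VALUES (2, 568, 5), (2, 570, 4), (2, 594, 5);
""")

# One output row per ad: AVG collapses the three ratings of ad 2,
# and returns NULL for ad 3 because it has no ratings at all.
rows = conn.execute("""
    SELECT ads.id, ads.header, ROUND(AVG(ads_rating.rating), 2) AS rating_average
    FROM ads
    LEFT OUTER JOIN ads_rating ON ads_rating.ad_id = ads.id
    GROUP BY ads.id, ads.header
    ORDER BY ads.id
""").fetchall()

for row in rows:
    print(row)
```

Ad 2 comes back with an average of about 4.67 and ad 3 with NULL (None in Python), which matches the shape of the desired result set.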

Another option would be to join two separate queries (see below). Effectively you take two queries and join them as if they were tables. This second approach is less optimal, but it can be useful if you are aggregating different data from different sources, for example if you were pulling an average from both promo and ads_rating and needed to show them in the results of one query.

select q1.*, q2.average_rating 
from
(
SELECT
category.id AS category_id,
category.subcat AS category_name,
category.desc AS category_desc,
category.name AS category_pretty_name,
ads.id as ads_id,
ads.header,
ads.price,
ads.oldprice,
ads.sellfast,
ads.hash,
ads.foto1,
ads.foto2,
ads.foto3,
ads.foto4,
ads.foto5,
ads.user_id,
SUBSTR(ads.body, 1, 160) AS body,
ads.subcat_id,
ads.updated,
ads.created,
ads.email,
ads.name,
ads.phone,
ads.hits,
ads.hidden,
promo.promotype AS promo_type,
supercategory.name AS supercategory_name,
supercategory.id AS supercategory_id
FROM ads 
inner join category ON category.id = ads.subcat_id 
inner join supercategory ON supercategory.id = category.cat  
LEFT OUTER JOIN promo ON promo.ads_id = ads.id 
WHERE recycle_bin != 1 AND hidden = 0
) q1
LEFT OUTER JOIN 
(
-- one row per ad, so this join does not multiply the rows of q1
select ad_id, avg(rating) as average_rating from ads_rating 
group by ad_id
) q2
on q1.ads_id = q2.ad_id
-- filter on the average here, after it has been computed
WHERE q1.promo_type > 0 OR q1.user_id = 20 OR q2.average_rating >= 4
ORDER BY promo_type DESC, updated DESC
LIMIT 5000

Hopefully this helps. I apologize if there are any typos; I wrote this in Notepad. Send back any errors and I will post a fix.

So far this is the best solution I have found, thanks to @beautiful.drifter:

SELECT q1.*, q2.average_rating FROM (
                    SELECT category.id AS category_id,
                    category.subcat AS category_name,
                    category.desc AS category_desc,
                    category.name AS category_pretty_name,
                    ads.id as ads_id,
                    ads.header,
                    ads.price,
                    ads.oldprice,
                    ads.sellfast,
                    ads.hash,
                    ads.foto1,
                    ads.foto2,
                    ads.foto3,
                    ads.foto4,
                    ads.foto5,
                    ads.user_id,
                    SUBSTR(ads.body, 1, 160) AS body,
                    ads.subcat_id,
                    ads.updated,
                    ads.created,
                    ads.email,
                    ads.name,
                    ads.phone,
                    ads.hits,
                    ads.hidden,
                    ads.recycle_bin,
                    promo.promotype AS promo_type,
                    supercategory.name AS supercategory_name,
                    supercategory.id AS supercategory_id
                    FROM ads 
                    INNER JOIN category ON category.id = ads.subcat_id 
                    INNER JOIN supercategory ON supercategory.id = category.cat  
                    LEFT OUTER JOIN promo ON promo.ads_id = ads.id 
                    ) q1
                    LEFT OUTER JOIN (
                    SELECT ad_id, avg(rating) as average_rating from ads_rating 
                    group by ad_id
                    ) q2
                    ON q1.ads_id = q2.ad_id
                    WHERE q1.recycle_bin != 1 AND q1.hidden = 0 AND ( q1.promo_type > 0 OR q2.average_rating >= 4 )
                    ORDER BY promo_type DESC, updated DESC
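For anyone who wants to try the derived-table approach without my schema, here is a cut-down sketch of the same query shape. It runs on Python's sqlite3 module rather than MySQL so it needs no server. Ad 4, its single rating, and the promo row for ad 3 are made-up rows added only to exercise the filter; they are not part of my sample data.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads (id INTEGER PRIMARY KEY, header TEXT);
    CREATE TABLE promo (ads_id INTEGER, promotype INTEGER);
    CREATE TABLE ads_rating (id INTEGER PRIMARY KEY, ad_id INTEGER, rating INTEGER);
    -- ad 4 and the promo row are hypothetical additions to the sample data
    INSERT INTO ads VALUES (2, 'Pretty pupy'), (3, 'Ugly pupy'), (4, 'Plain pupy');
    INSERT INTO promo VALUES (3, 1);
    INSERT INTO ads_rating (ad_id, rating) VALUES (2, 5), (2, 4), (2, 5), (4, 2);
""")

# q2 is aggregated to one row per ad before the join, so q1 rows are never
# duplicated, and the outer WHERE can filter on the finished average.
rows = conn.execute("""
    SELECT q1.ads_id, q1.promo_type, ROUND(q2.average_rating, 2)
    FROM (
        SELECT ads.id AS ads_id, promo.promotype AS promo_type
        FROM ads
        LEFT OUTER JOIN promo ON promo.ads_id = ads.id
    ) q1
    LEFT OUTER JOIN (
        SELECT ad_id, AVG(rating) AS average_rating
        FROM ads_rating
        GROUP BY ad_id
    ) q2 ON q1.ads_id = q2.ad_id
    WHERE q1.promo_type > 0 OR q2.average_rating >= 4
    ORDER BY q1.promo_type DESC, q1.ads_id
""").fetchall()

for row in rows:
    print(row)
```

The promoted ad 3 is listed first even though it has no ratings, ad 2 follows because its average is above 4, and ad 4 is filtered out because it is neither promoted nor rated highly enough.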
