SQL - Most frequent value in column of joined tables
I have three tables described below:
Area (Id, Description)
City(Id, Name)
Problem(Id, City, Area, Definition):
City references City (Id), Area references Area (Id)
I want to find the most frequent value of Area(Description) that appears in Problem for each City (Name).
Example:
Area
Id Description
1 Support
2 Finance
City
Id Name
1 Chicago
2 Boston
Problem
Id City Area Definition
1 1 2 A
2 1 2 B
3 1 1 C
4 2 1 D
Desired Output:
Name Description
Chicago Finance
Boston Support
Here's what I have tried with no success:
SELECT Name,
       Description
FROM
  (SELECT *
   FROM Problem AS P,
        City AS C,
        Area AS A
   WHERE C.Id = P.City
     AND A.Id = P.Area) AS T1
WHERE Description =
    (SELECT Description
     FROM
       (SELECT *
        FROM Problem AS P,
             City AS C,
             Area AS A
        WHERE C.Id = P.City
          AND A.Id = P.Area) AS T2
     WHERE T1.Name = T2.Name
     GROUP BY Description
     ORDER BY Count(Name) DESC
     LIMIT 1)
GROUP BY Name,
         Description
Thanks!
The area with the maximum count for each city can be found with:
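select C.Name, A.Description
from (
    -- number of problems per (City, Area) pair
    select P.City, P.Area, count(*) as Freq
    from Problem as P
    group by P.City, P.Area
) t1
INNER JOIN (
    -- highest per-area count within each city
    select t0.City, max(t0.Freq) as max_freq
    from (
        select P.City, P.Area, count(*) as Freq
        from Problem as P
        group by P.City, P.Area
    ) t0
    group by t0.City
) t2 ON t2.City = t1.City AND t2.max_freq = t1.Freq
INNER JOIN City AS C ON t1.City = C.Id
INNER JOIN Area AS A ON A.Id = t1.Area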
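As a quick sanity check of the intermediate step, the inner count-per-(City, Area) aggregation can be run on its own against the sample data from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the aggregate itself is plain SQL), and the helper name load_sample is made up for illustration.

```python
import sqlite3


def load_sample(conn):
    """Create the three tables from the question and load its sample rows."""
    conn.executescript("""
        CREATE TABLE Area (Id INTEGER PRIMARY KEY, Description TEXT);
        CREATE TABLE City (Id INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Problem (
            Id INTEGER PRIMARY KEY,
            City INTEGER REFERENCES City (Id),
            Area INTEGER REFERENCES Area (Id),
            Definition TEXT
        );
        INSERT INTO Area VALUES (1, 'Support'), (2, 'Finance');
        INSERT INTO City VALUES (1, 'Chicago'), (2, 'Boston');
        INSERT INTO Problem VALUES
            (1, 1, 2, 'A'), (2, 1, 2, 'B'), (3, 1, 1, 'C'), (4, 2, 1, 'D');
    """)


conn = sqlite3.connect(":memory:")  # in-memory SQLite, standing in for MySQL
load_sample(conn)

# One row per (City, Area) pair with the number of problems in it.
counts = conn.execute("""
    SELECT P.City, P.Area, COUNT(*) AS Freq
    FROM Problem AS P
    GROUP BY P.City, P.Area
    ORDER BY P.City, P.Area
""").fetchall()

for row in counts:
    print(row)
```

Chicago (City 1) has one Support problem and two Finance problems, so what remains is to keep, per city, the row with the largest Freq.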
This is probably the shortest way to solve your issue:
select c.Name, a.Description
from City c
cross join Area a
where a.Id = (
    select p.Area
    from Problem p
    where p.City = c.Id
    group by p.Area
    order by count(*) desc, p.Area asc
    limit 1
)
We use a CROSS JOIN to combine every City with every Area. But we pick only the Area with the highest count in the Problem table for the given city, which is determined in the correlated subquery. If two areas have the same highest count for a city, the one with the lowest Area Id will be picked (order by ... p.Area asc). Note that p.Area holds the Id, not the Description, so the tie is not broken alphabetically.
Result:
| Name | Description |
|---------|-------------|
| Boston | Support |
| Chicago | Finance |
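To see the tie-breaking of the correlated subquery in action, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module (used here as a stand-in for MySQL, which is an assumption). Problem row 5 is not part of the question; it is added so that Boston has one Support and one Finance problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory SQLite, standing in for MySQL
conn.executescript("""
    CREATE TABLE Area (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE City (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Problem (
        Id INTEGER PRIMARY KEY, City INTEGER, Area INTEGER, Definition TEXT
    );
    INSERT INTO Area VALUES (1, 'Support'), (2, 'Finance');
    INSERT INTO City VALUES (1, 'Chicago'), (2, 'Boston');
    INSERT INTO Problem VALUES
        (1, 1, 2, 'A'), (2, 1, 2, 'B'), (3, 1, 1, 'C'), (4, 2, 1, 'D'),
        (5, 2, 2, 'E');  -- hypothetical extra row: ties Boston at 1 vs 1
""")

# For each city, keep the single area whose Id the subquery returns.
top_area = conn.execute("""
    SELECT c.Name, a.Description
    FROM City c
    CROSS JOIN Area a
    WHERE a.Id = (
        SELECT p.Area
        FROM Problem p
        WHERE p.City = c.Id
        GROUP BY p.Area
        ORDER BY COUNT(*) DESC, p.Area ASC
        LIMIT 1
    )
    ORDER BY c.Name
""").fetchall()

print(top_area)  # [('Boston', 'Support'), ('Chicago', 'Finance')]
```

Boston resolves to Support (Area Id 1) even though Finance sorts first alphabetically, because the tie-break column is the numeric Id. Ordering by the description instead would require joining Area inside the subquery.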
Here's another, more complex solution which includes the count.
select c.Name, a.Description, city_area_maxcount.mc as problem_count
from (
    select City, max(c) as mc
    from (
        select p.City, p.Area, count(*) as c
        from Problem p
        group by p.City, p.Area
    ) city_area_count
    group by City
) city_area_maxcount
join (
    select p.City, p.Area, count(*) as c
    from Problem p
    group by p.City, p.Area
) city_area_count
    on city_area_count.City = city_area_maxcount.City
    and city_area_count.c = city_area_maxcount.mc
join City c on c.Id = city_area_count.City
join Area a on a.Id = city_area_count.Area
The subquery aliased as city_area_count is used twice here (I hope MySQL can cache the result). If you think of it as a table, that would be a common find-the-row-with-top-value-per-group problem. If two areas have the same highest count for a city, both will be selected.
Result:
| Name | Description | problem_count |
|---------|-------------|---------------|
| Boston | Support | 1 |
| Chicago | Finance | 2 |
Demo: http://sqlfiddle.com/#!9/c66a5/2
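One way to avoid writing the count subquery twice is a common table expression. The sketch below is a variant, not the query from the answer: it assumes an engine that supports WITH (MySQL 8.0 or later; it is run here through Python's built-in sqlite3 module as a stand-in), renames the count column to cnt for readability, and adds a hypothetical Problem row 5 so that Boston has a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory SQLite, standing in for MySQL 8+
conn.executescript("""
    CREATE TABLE Area (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE City (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Problem (
        Id INTEGER PRIMARY KEY, City INTEGER, Area INTEGER, Definition TEXT
    );
    INSERT INTO Area VALUES (1, 'Support'), (2, 'Finance');
    INSERT INTO City VALUES (1, 'Chicago'), (2, 'Boston');
    INSERT INTO Problem VALUES
        (1, 1, 2, 'A'), (2, 1, 2, 'B'), (3, 1, 1, 'C'), (4, 2, 1, 'D'),
        (5, 2, 2, 'E');  -- hypothetical extra row: ties Boston at 1 vs 1
""")

rows = conn.execute("""
    WITH city_area_count AS (
        -- problems per (City, Area) pair, defined once and reused below
        SELECT City, Area, COUNT(*) AS cnt
        FROM Problem
        GROUP BY City, Area
    )
    SELECT c.Name, a.Description, cac.cnt AS problem_count
    FROM city_area_count cac
    JOIN (
        -- highest per-area count within each city
        SELECT City, MAX(cnt) AS mc
        FROM city_area_count
        GROUP BY City
    ) city_area_maxcount
        ON city_area_maxcount.City = cac.City
        AND city_area_maxcount.mc = cac.cnt
    JOIN City c ON c.Id = cac.City
    JOIN Area a ON a.Id = cac.Area
    ORDER BY c.Name, a.Description
""").fetchall()

for row in rows:
    print(row)
```

With the extra row, Boston appears twice (Finance and Support, each with a count of 1), which matches the behaviour described above for ties. Whether the engine actually evaluates the CTE only once depends on its optimizer.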