Can I do a max(count(*)) in SQL?
Here's my code:
select yr,count(*)
from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr;
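Here's the question:
Which were the busiest years for 'John Travolta'? Show the number of movies he made for each year.
Here's the table structure: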
movie(id, title, yr, score, votes, director)
actor(id, name)
casting(movieid, actorid, ord)
This is the output I am getting:
yr count(*)
1976 1
1977 1
1978 1
1981 1
1994 1
-- etc.
I need to get the rows for which count(*) is max. How do I do this?
Use:
SELECT m.yr,
COUNT(*) AS num_movies
FROM MOVIE m
JOIN CASTING c ON c.movieid = m.id
JOIN ACTOR a ON a.id = c.actorid
AND a.name = 'John Travolta'
GROUP BY m.yr
ORDER BY num_movies DESC, m.yr DESC
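Ordering by num_movies DESC puts the highest values at the top of the result set. If several years have the same count, m.yr DESC places the most recent of them first, until the num_movies value changes.
No, you can't nest aggregate functions directly inside one another in the same SELECT clause. The inner aggregate has to be computed in a subquery, e.g.: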
SELECT MAX(y.num)
FROM (SELECT COUNT(*) AS num
FROM TABLE x) y
Just order by count(*) desc and you'll get the highest value (if you combine it with limit 1).
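As a rough, runnable check of the two ideas above (an aggregate over a subquery's aggregate, and ORDER BY ... LIMIT 1), here is a minimal sketch using Python's built-in sqlite3 module. The rows are invented toy data, not the real sqlzoo tables, and SQLite is only a stand-in for whatever RDBMS you actually use.

```python
import sqlite3

# Toy data, invented for illustration only (not the real sqlzoo tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO movie VALUES (1,'m1',1976),(2,'m2',1977),(3,'m3',1998),(4,'m4',1998),
                             (5,'m5',1998),(6,'m6',2010),(7,'m7',2010),(8,'m8',2010);
    INSERT INTO actor VALUES (1,'John Travolta'),(2,'Someone Else');
    INSERT INTO casting VALUES (1,1,1),(2,1,1),(3,1,1),(4,1,1),(5,1,1),
                               (6,1,1),(7,1,1),(8,1,1),(3,2,2);
""")

# Inner aggregate: one row per year with the number of movies.
PER_YEAR = """
    SELECT m.yr AS yr, COUNT(*) AS num_movies
    FROM movie m
    JOIN casting c ON c.movieid = m.id
    JOIN actor a ON a.id = c.actorid
    WHERE a.name = 'John Travolta'
    GROUP BY m.yr
"""

# Outer aggregate over the subquery: the closest thing to MAX(COUNT(*)).
max_count = conn.execute(
    f"SELECT MAX(y.num_movies) FROM ({PER_YEAR}) y"
).fetchone()[0]

# Sort descending and keep one row; yr DESC breaks ties toward the latest year.
top_row = conn.execute(
    f"{PER_YEAR} ORDER BY num_movies DESC, yr DESC LIMIT 1"
).fetchone()

print(max_count, top_row)
```

Note that the LIMIT 1 variant returns a single year even though two years share the highest count in this toy data.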
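This question is old, but it was referenced in a new question on dba.SE. I feel the best solutions haven't been provided yet. Plus, there are new, faster options.
Can I do a max(count(*)) in SQL?
Yes, you can achieve that by nesting an aggregate function in a window function: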
SELECT m.yr, count(*) AS movie_count
, max(count(*)) OVER () AS max_ct
FROM casting c
JOIN movie m ON c.movieid = m.id
WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP BY m.yr
ORDER BY count(*) DESC;
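That's standard SQL. Postgres introduced it with version 8.4 (released 2009-07-01, before this question was asked). Other RDBMS should be capable of the same. It works because of the sequence of events in a SELECT query: window functions are evaluated after aggregation, so they can operate on aggregate results.
Possible downside: window functions do not aggregate rows. You still get all rows left after the aggregate step. That is useful in some queries, but not ideal for this one.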
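To make the "you get all rows" point concrete, here is a small sketch that runs the window-function query against invented toy rows with Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). The data and the choice of SQLite are assumptions for illustration, not part of the original answer.

```python
import sqlite3

# Toy data, invented for illustration only (not the real sqlzoo tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO movie VALUES (1,'m1',1976),(2,'m2',1977),(3,'m3',1998),(4,'m4',1998),
                             (5,'m5',1998),(6,'m6',2010),(7,'m7',2010),(8,'m8',2010);
    INSERT INTO actor VALUES (1,'John Travolta'),(2,'Someone Else');
    INSERT INTO casting VALUES (1,1,1),(2,1,1),(3,1,1),(4,1,1),(5,1,1),
                               (6,1,1),(7,1,1),(8,1,1),(3,2,2);
""")

# count(*) is computed per group first; max(...) OVER () then runs over
# those per-year counts and repeats the overall maximum on every row.
rows = conn.execute("""
    SELECT m.yr, count(*) AS movie_count,
           max(count(*)) OVER () AS max_ct
    FROM casting c
    JOIN movie m ON c.movieid = m.id
    WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
    GROUP BY m.yr
    ORDER BY m.yr
""").fetchall()

# Every year is still present; filtering has to happen in a later step.
busiest = [yr for yr, movie_count, max_ct in rows if movie_count == max_ct]
print(rows, busiest)
```

The maximum is attached to each row, so keeping only the busiest years still requires an outer query or client-side filter, as done in the last step here.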
To get one row with the highest count, you can use ORDER BY ct DESC LIMIT 1:
SELECT m.yr, count(*) AS ct
FROM actor a
JOIN casting c ON c.actorid = a.id
JOIN movie m ON m.id = c.movieid  -- yr lives in movie, not in casting
WHERE a.name = 'John Travolta'
GROUP BY m.yr
ORDER BY ct DESC
LIMIT 1;
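This uses only basic SQL features that are available in any halfway decent RDBMS, although the LIMIT syntax varies between systems (for example TOP in SQL Server or FETCH FIRST in standard SQL).
Or, to get one row per group with the highest count, you can use DISTINCT ON (Postgres only).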
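I need to get the rows for which count(*) is max.
There may be more than one row with the highest count.
SQL Server has had the WITH TIES feature for some time, with non-standard syntax: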
SELECT TOP 1 WITH TIES
m.yr, count(*) AS movie_count
FROM casting c
JOIN movie m ON c.movieid = m.id
WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP BY m.yr
ORDER BY count(*) DESC; -- can't sort by year for this
PostgreSQL 13 added WITH TIES with standard SQL syntax:
SELECT m.yr, count(*) AS movie_count
FROM casting c
JOIN movie m ON c.movieid = m.id
WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP BY m.yr
ORDER BY count(*) DESC -- can't sort by year for this
FETCH FIRST 1 ROWS WITH TIES;
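This should be the fastest possible query.
To sort the results by additional criteria (or for older versions of Postgres, or other RDBMS without WITH TIES), use the window function rank() in a subquery: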
SELECT yr, movie_count
FROM (
SELECT m.yr, count(*) AS movie_count
, rank() OVER (ORDER BY count(*) DESC) AS rnk
FROM casting c
JOIN movie m ON c.movieid = m.id
WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP BY m.yr
) sub
WHERE rnk = 1
ORDER BY yr; -- optionally sort by year
All major RDBMS support window functions nowadays.
SELECT * from
(
SELECT yr as YEAR, COUNT(title) as TCOUNT
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr
order by TCOUNT desc
) res
where rownum < 2
It's from this site: http://sqlzoo.net/3.htm. There are 2 possible solutions:
With TOP 1 and ORDER BY ... DESC:
SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id=actorid
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING count(title)=(SELECT TOP 1 COUNT(title)
FROM casting
JOIN movie ON movieid=movie.id
JOIN actor ON actor.id=actorid
WHERE name='John Travolta'
GROUP BY yr
ORDER BY count(title) desc)
With MAX:
SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id=actorid
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING
count(title)=
(SELECT MAX(A.CNT)
FROM (SELECT COUNT(title) AS CNT FROM actor
JOIN casting ON actor.id=actorid
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY (yr)) AS A)
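Using max with a limit gives you only the first row, so if two or more rows share the maximum number of movies, you will miss some data. If you have the rank() function available, the following is one way to handle that.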
SELECT
total_final.yr,
total_final.num_movies
FROM
( SELECT
total.yr,
total.num_movies,
RANK() OVER (ORDER BY num_movies desc) rnk
FROM (
SELECT
m.yr,
COUNT(*) AS num_movies
FROM MOVIE m
JOIN CASTING c ON c.movieid = m.id
JOIN ACTOR a ON a.id = c.actorid
WHERE a.name = 'John Travolta'
GROUP BY m.yr
) AS total
) AS total_final
WHERE rnk = 1
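The difference between LIMIT 1 and rank() is easiest to see when two years tie. The sketch below is an illustration under stated assumptions: it uses Python's sqlite3 module (SQLite 3.25 or newer for window functions) and invented toy rows in which two years share the highest count.

```python
import sqlite3

# Toy data, invented for illustration only (not the real sqlzoo tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO movie VALUES (1,'m1',1976),(2,'m2',1977),(3,'m3',1998),(4,'m4',1998),
                             (5,'m5',1998),(6,'m6',2010),(7,'m7',2010),(8,'m8',2010);
    INSERT INTO actor VALUES (1,'John Travolta'),(2,'Someone Else');
    INSERT INTO casting VALUES (1,1,1),(2,1,1),(3,1,1),(4,1,1),(5,1,1),
                               (6,1,1),(7,1,1),(8,1,1),(3,2,2);
""")

# Per-year movie counts, reused by both variants below.
PER_YEAR = """
    SELECT m.yr AS yr, COUNT(*) AS num_movies
    FROM movie m
    JOIN casting c ON c.movieid = m.id
    JOIN actor a ON a.id = c.actorid
    WHERE a.name = 'John Travolta'
    GROUP BY m.yr
"""

# LIMIT 1 keeps exactly one row, even when several years tie for first place.
limited = conn.execute(
    f"{PER_YEAR} ORDER BY num_movies DESC LIMIT 1"
).fetchall()

# RANK() assigns 1 to every row that shares the highest count,
# so filtering on rnk = 1 keeps all tied years.
ranked = conn.execute(f"""
    SELECT yr, num_movies
    FROM (SELECT yr, num_movies,
                 RANK() OVER (ORDER BY num_movies DESC) AS rnk
          FROM ({PER_YEAR}) AS total) AS total_final
    WHERE rnk = 1
    ORDER BY yr
""").fetchall()

print(limited, ranked)
```

Which of the tied years the LIMIT 1 variant returns is not defined without an extra sort key, which is another reason to prefer the rank() form when ties matter.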
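The following code gives you the answer. It essentially implements MAX(COUNT(*)) by using ALL. Its advantage is that it relies only on very basic commands and operations.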
SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
(SELECT COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr)
Thanks to the last answer:
SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
(SELECT COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr)
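I had the same problem: I only needed the records whose count matches the maximum count (which could be one record or several).
I have to learn more about the "ALL clause"; this is exactly the kind of simple solution I was looking for.
Depending on which database you're using...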
select yr, count(*) num from ...
order by num desc
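Most of my experience is with Sybase, which uses some syntax that differs from other DBs. But in this case you are naming your count column, so you can sort by it in descending order. You can go a step further and restrict the results to the first 10 rows (to find his 10 busiest years).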
select top 1 yr,count(*) from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr order by 2 desc
create view sal as
select yr, count(*) as ct
from (select m.title, m.yr
      from movie m, actor a, casting c
      where a.name = 'John Travolta'
      and a.id = c.actorid
      and c.movieid = m.id) t  -- most databases require an alias for a derived table
group by yr;
-- once the view exists, query it:
select yr from sal
where ct = (select max(ct) from sal);
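For anyone who wants to try the view approach without a database server, here is a minimal sketch with Python's sqlite3 module. The view name sal comes from the answer above. The toy rows and the use of SQLite are assumptions made only for this illustration.

```python
import sqlite3

# Toy data, invented for illustration only (not the real sqlzoo tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO movie VALUES (1,'m1',1976),(2,'m2',1977),(3,'m3',1998),(4,'m4',1998),
                             (5,'m5',1998),(6,'m6',2010),(7,'m7',2010),(8,'m8',2010);
    INSERT INTO actor VALUES (1,'John Travolta'),(2,'Someone Else');
    INSERT INTO casting VALUES (1,1,1),(2,1,1),(3,1,1),(4,1,1),(5,1,1),
                               (6,1,1),(7,1,1),(8,1,1),(3,2,2);

    -- The view stores the per-year counts under a name, so the
    -- aggregation does not have to be written out twice.
    CREATE VIEW sal AS
    SELECT m.yr AS yr, COUNT(*) AS ct
    FROM movie m
    JOIN casting c ON c.movieid = m.id
    JOIN actor a ON a.id = c.actorid
    WHERE a.name = 'John Travolta'
    GROUP BY m.yr;
""")

# Compare each year's count with the maximum taken over the same view.
busiest_years = [
    row[0]
    for row in conn.execute(
        "SELECT yr FROM sal WHERE ct = (SELECT MAX(ct) FROM sal) ORDER BY yr"
    )
]
print(busiest_years)
```

Because the filter compares against the maximum rather than cutting off after one row, tied years are all returned.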
You can use top together with with ties, which will include all of the years that have the maximum count(*) value, like this:
select top (1) with ties yr, count(*)
from movie
join casting
on casting.movieid=movie.id
join actor
on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr
order by count(*) desc;
If the maximum is, say, 6, you'll get all of the years for which the count value is 6.