
Can I do a max(count(*)) in SQL?

Here's my code:

select yr,count(*)
from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr;

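Here's the question:

Which were the busiest years for 'John Travolta'? Show the number of movies he made for each year.

Here's the table structure: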

movie(id, title, yr, score, votes, director)
actor(id, name)
casting(movieid, actorid, ord)

This is the output I'm getting:

yr      count(*)
1976    1
1977    1
1978    1
1981    1
1994    1
-- etc.

I need to get the rows for which count(*) is max. How do I do this?

Use:

  SELECT m.yr, 
         COUNT(*) AS num_movies
    FROM MOVIE m
    JOIN CASTING c ON c.movieid = m.id
    JOIN ACTOR a ON a.id = c.actorid
                AND a.name = 'John Travolta'
GROUP BY m.yr
ORDER BY num_movies DESC, m.yr DESC

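Ordering by num_movies DESC puts the highest counts at the top of the result set. If several years have the same count, m.yr DESC places the most recent of those years first within that count.

Can I use a MAX(COUNT(*))?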


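No, you can't layer aggregate functions on top of one another in the same SELECT clause. The inner aggregate has to be performed in a subquery. For example: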

SELECT MAX(y.num)
  FROM (SELECT COUNT(*) AS num
          FROM TABLE x) y
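To make the difference concrete, here is a minimal sketch that runs both forms against an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite, the cut-down movie table, and the sample rows are assumptions made for illustration; other databases word the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: one movie in 1977, two in 1996, three in 1998.
    CREATE TABLE movie (id INTEGER PRIMARY KEY, yr INTEGER);
    INSERT INTO movie (yr) VALUES (1977), (1996), (1996), (1998), (1998), (1998);
""")

# Stacking one aggregate directly on another is rejected.
try:
    conn.execute("SELECT MAX(COUNT(*)) FROM movie GROUP BY yr").fetchall()
    nested_error = None
except sqlite3.OperationalError as exc:
    nested_error = str(exc)

# Moving the inner aggregate into a subquery works.
max_count = conn.execute("""
    SELECT MAX(y.num)
    FROM (SELECT COUNT(*) AS num FROM movie GROUP BY yr) AS y
""").fetchone()[0]

print("nested form failed:", nested_error is not None)
print("max count via subquery:", max_count)
```

The subquery form returns only the maximum count, not the year it belongs to, which is why the other answers join or filter against it.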

Just order by count(*) desc and you'll get the highest value (if you combine it with limit 1).
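As a rough sketch of that suggestion, the snippet below runs the question's query with ORDER BY and LIMIT 1 on SQLite via Python's sqlite3 module. The database engine, the movie titles, and the sample rows are assumptions for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    -- Hypothetical rows: one movie in 1977, two in 1996, three in 1998.
    INSERT INTO actor VALUES (1, 'John Travolta');
    INSERT INTO movie VALUES (1, 'm1', 1977), (2, 'm2', 1996), (3, 'm3', 1996),
                             (4, 'm4', 1998), (5, 'm5', 1998), (6, 'm6', 1998);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1),
                               (4, 1, 1), (5, 1, 1), (6, 1, 1);
""")

# Sort the per-year counts in descending order and keep only the first row.
busiest = conn.execute("""
    SELECT yr, COUNT(*) AS num_movies
    FROM movie
    JOIN casting ON casting.movieid = movie.id
    JOIN actor ON casting.actorid = actor.id
    WHERE actor.name = 'John Travolta'
    GROUP BY yr
    ORDER BY COUNT(*) DESC
    LIMIT 1
""").fetchone()

print(busiest)
```

LIMIT keeps exactly one row, so if two years shared the top count only one of them would be returned.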

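This question is old, but it was referenced in a new question on dba.SE. I feel the best solutions haven't been provided yet. Plus, there are new, faster options.

Question in the title

Can I do a max(count(*)) in SQL?

Yes, you can achieve that by nesting an aggregate function in a window function: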

SELECT m.yr, count(*) AS movie_count
     , max(count(*)) OVER () AS max_ct
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC;


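That's standard SQL. Postgres introduced it with version 8.4 (released 2009-07-01, before this question was asked). Other RDBMS should be capable of the same. Consider the sequence of events in a SELECT query: window functions are evaluated after aggregate functions, so max() can operate on the result of count(*).

Possible downside: window functions do not aggregate rows. You get all rows left after the aggregate step. That is useful in some queries, but not ideal for this one.

To get one row with the highest count, you can use ORDER BY ct DESC LIMIT 1: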

SELECT m.yr, count(*) AS ct
FROM   actor   a
JOIN   casting c ON c.actorid = a.id
JOIN   movie   m ON m.id = c.movieid  -- yr lives in movie, not in casting
WHERE  a.name = 'John Travolta'
GROUP  BY m.yr
ORDER  BY ct DESC
LIMIT  1;

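This uses only basic SQL features available in any halfway decent RDBMS, although the LIMIT implementation varies between systems.

Or you can get one row per group with the highest count with DISTINCT ON (Postgres only).

Actual Question

I need to get the rows for which count(*) is max.

There may be more than one row with the highest count.

SQL Server has had the WITH TIES feature for some time, with non-standard syntax: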

SELECT TOP 1 WITH TIES
       m.yr, count(*) AS movie_count
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC;  -- can't sort by year for this


PostgreSQL 13 added WITH TIES with standard SQL syntax:

SELECT m.yr, count(*) AS movie_count
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC  -- can't sort by year for this
FETCH  FIRST 1 ROWS WITH TIES;


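This should be the fastest possible query.

To sort results by additional criteria (or for older versions of Postgres, or other RDBMS without WITH TIES), use the window function rank() in a subquery: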

SELECT yr, movie_count
FROM  (
   SELECT m.yr, count(*) AS movie_count
        , rank() OVER (ORDER BY count(*) DESC) AS rnk
   FROM   casting c
   JOIN   movie   m ON c.movieid = m.id
   WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
   GROUP  BY m.yr
   ) sub
WHERE  rnk = 1
ORDER  BY yr;  -- optionally sort by year

All major RDBMS support window functions nowadays.
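The rank() approach can be tried without a Postgres server. The sketch below runs it on SQLite through Python's sqlite3 module, which needs SQLite 3.25 or later for window functions. The sample rows are hypothetical, and the per-year counts are computed in a derived table before ranking, a deliberate simplification rather than the exact nested form shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    -- Hypothetical rows: 1996 and 1998 tie with two movies each, 1977 has one.
    INSERT INTO actor VALUES (1, 'John Travolta');
    INSERT INTO movie VALUES (1, 'm1', 1977), (2, 'm2', 1996), (3, 'm3', 1996),
                             (4, 'm4', 1998), (5, 'm5', 1998);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 1), (5, 1, 1);
""")

# rank() gives every year with the highest count the same rank 1,
# so filtering on rnk = 1 keeps all tied years.
top_years = conn.execute("""
    SELECT yr, movie_count
    FROM (
        SELECT yr, movie_count,
               RANK() OVER (ORDER BY movie_count DESC) AS rnk
        FROM (
            SELECT m.yr AS yr, COUNT(*) AS movie_count
            FROM casting c
            JOIN movie m ON c.movieid = m.id
            WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
            GROUP BY m.yr
        ) AS counts
    ) AS ranked
    WHERE rnk = 1
    ORDER BY yr
""").fetchall()

print(top_years)
```

Unlike LIMIT 1, this keeps both tied years, and the outer ORDER BY is free to sort them by any additional criterion.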

SELECT * from 
(
SELECT yr as YEAR, COUNT(title) as TCOUNT
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr
order by TCOUNT desc
) res
where rownum < 2

It's from this site: http://sqlzoo.net/3.htm. Two possible solutions:

With TOP 1 and ORDER BY ... DESC:

SELECT yr, COUNT(title) 
FROM actor 
JOIN casting ON actor.id=actorid
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING count(title)=(SELECT TOP 1 COUNT(title) 
FROM casting 
JOIN movie ON movieid=movie.id 
JOIN actor ON actor.id=actorid
WHERE name='John Travolta'
GROUP BY yr
ORDER BY count(title) desc)

With MAX:

SELECT yr, COUNT(title) 
FROM actor  
JOIN casting ON actor.id=actorid    
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING 
    count(title)=
        (SELECT MAX(A.CNT) 
            FROM (SELECT COUNT(title) AS CNT FROM actor 
                JOIN casting ON actor.id=actorid
                JOIN movie ON movie.id=movieid
                    WHERE name = 'John Travolta'
                    GROUP BY (yr)) AS A)

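Using max with a limit will only give you the first row, so if two or more rows share the same maximum number of movies, you will miss some data. Below is a way to do it if you have the rank() function available.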

SELECT
    total_final.yr,
    total_final.num_movies
    FROM
    ( SELECT 
        total.yr, 
        total.num_movies, 
        RANK() OVER (ORDER BY num_movies desc) rnk
        FROM (
               SELECT 
                      m.yr, 
                      COUNT(*) AS num_movies
               FROM MOVIE m
               JOIN CASTING c ON c.movieid = m.id
               JOIN ACTOR a ON a.id = c.actorid
               WHERE a.name = 'John Travolta'
               GROUP BY m.yr
             ) AS total
    ) AS total_final 
   WHERE rnk = 1

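The following code gives you the answer. It essentially implements MAX(COUNT(*)) by using ALL. It has the advantage of using only very basic commands and operations.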

SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
  (SELECT COUNT(title)
   FROM actor
   JOIN casting ON actor.id = casting.actorid
   JOIN movie ON casting.movieid = movie.id
   WHERE name = 'John Travolta'
   GROUP BY yr)

Thanks to the last answer:

SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
  (SELECT COUNT(title)
   FROM actor
   JOIN casting ON actor.id = casting.actorid
   JOIN movie ON casting.movieid = movie.id
   WHERE name = 'John Travolta'
   GROUP BY yr)

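I had the same problem: I only needed the records whose count matches the maximum count (which could be one or several records).

I have to learn more about the "ALL clause", and this is exactly the kind of simple solution I was looking for.

Depending on which database you're using...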

select yr, count(*) num from ...
order by num desc

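Most of my experience is in Sybase, which uses some different syntax than other DBs. But in this case, you're naming your count column, so you can sort by it in descending order. You can go a step further and restrict your results to the first 10 rows (to find his 10 busiest years).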

select top 1 yr, count(*)
from movie
join casting on casting.movieid = movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr
order by 2 desc

create view sal as
select yr, count(*) as ct
from (select title, yr
      from movie m, actor a, casting c
      where a.name = 'JOHN'
        and a.id = c.actorid
        and c.movieid = m.id) t  -- alias added: many databases require one for a derived table
group by yr

-----view created-----

select yr from sal
where ct =(select max(ct) from sal)

2013
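The view approach can be reproduced end to end with a small sketch on SQLite via Python's sqlite3 module. The view name sal comes from the answer above; the database engine, the sample rows, the explicit joins, and the 'John Travolta' filter are assumptions chosen to match the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    -- Hypothetical rows: 1996 and 1998 tie with two movies each, 1977 has one.
    INSERT INTO actor VALUES (1, 'John Travolta');
    INSERT INTO movie VALUES (1, 'm1', 1977), (2, 'm2', 1996), (3, 'm3', 1996),
                             (4, 'm4', 1998), (5, 'm5', 1998);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 1), (5, 1, 1);

    -- The view holds one row per year with that year's movie count.
    CREATE VIEW sal AS
    SELECT m.yr AS yr, COUNT(*) AS ct
    FROM movie m
    JOIN casting c ON c.movieid = m.id
    JOIN actor a ON a.id = c.actorid
    WHERE a.name = 'John Travolta'
    GROUP BY m.yr;
""")

# Keep every year whose count equals the maximum count in the view.
busiest_years = [row[0] for row in conn.execute(
    "SELECT yr FROM sal WHERE ct = (SELECT MAX(ct) FROM sal) ORDER BY yr"
)]

print(busiest_years)
```

Because the filter compares against MAX(ct) rather than taking the first row, tied years are all returned.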

You can use top together with with ties, which will include all of the years that have the maximum count(*) value, like this:

select top (1) with ties yr, count(*)
from movie
   join casting 
      on casting.movieid=movie.id
   join actor 
      on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr
order by count(*) desc;

If the maximum is, say, 6, you'll get all of the years for which the count value is 6.

