Can I do a max(count(*)) in SQL?

Here's my code:

select yr,count(*)
from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr;

Here's the question:

Which were the busiest years for 'John Travolta'? Show the number of movies he made for each year.

Here's the table structure:

movie(id, title, yr, score, votes, director)
actor(id, name)
casting(movieid, actorid, ord)

This is the output I am getting:

yr      count(*)
1976    1
1977    1
1978    1
1981    1
1994    1
-- etc.

I need to get the rows for which count(*) is max. How do I do this?

Use:

  SELECT m.yr, 
         COUNT(*) AS num_movies
    FROM MOVIE m
    JOIN CASTING c ON c.movieid = m.id
    JOIN ACTOR a ON a.id = c.actorid
                AND a.name = 'John Travolta'
GROUP BY m.yr
ORDER BY num_movies DESC, m.yr DESC

Ordering by num_movies DESC puts the highest values at the top of the result set. If several years have the same count, the secondary sort on m.yr DESC places the most recent of those years first.

Can I use a MAX(COUNT(*))?

No, you cannot layer aggregate functions on top of one another in the same SELECT clause. The inner aggregate has to be performed in a subquery, i.e.:

SELECT MAX(y.num)
  FROM (SELECT COUNT(*) AS num
          FROM TABLE x) y
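
A minimal sketch of that difference, runnable with Python's built-in sqlite3 module. SQLite is only a stand-in for your RDBMS here (other engines may behave differently), the movie table is cut down to three columns, and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical version of the movie table; rows are invented.
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    INSERT INTO movie (id, title, yr) VALUES
        (1, 'Film A', 1994), (2, 'Film B', 1998),
        (3, 'Film C', 1998), (4, 'Film D', 2001);
""")

# Nesting one aggregate directly inside another is rejected by SQLite.
try:
    conn.execute("SELECT MAX(COUNT(*)) FROM movie GROUP BY yr")
    nested_ok = True
except sqlite3.Error:
    nested_ok = False

# Computing the inner aggregate in a subquery works.
max_count = conn.execute("""
    SELECT MAX(y.num)
    FROM  (SELECT COUNT(*) AS num
           FROM   movie
           GROUP  BY yr) y
""").fetchone()[0]

print(nested_ok, max_count)
```

Note the GROUP BY in the subquery: without it the inner query returns a single count, and taking MAX of it adds nothing.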

Just order by count(*) desc and you'll get the highest value (if you combine it with limit 1).

This question is old, but was referenced in a new question on dba.SE. I feel the best solutions haven't been provided. Plus, there are new, faster options.

Question in the title

Can I do a max(count(*)) in SQL?

Yes, you can achieve that by nesting an aggregate function in a window function:

SELECT m.yr, count(*) AS movie_count
     , max(count(*)) OVER () AS max_ct
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC;

That's standard SQL. Postgres introduced it with version 8.4 (released 2009-07-01, before this question was asked). Other RDBMS should be capable of the same. It works because of the sequence of events in a SELECT query: aggregate functions are computed in the GROUP BY step, and window functions are applied afterwards to the aggregated rows.

Possible downside: window functions do not aggregate rows. You get all rows left after the aggregate step. Useful in some queries, but not ideal for this one.
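
To see what "all rows left after the aggregate step" means in practice, here is a small sketch using Python's built-in sqlite3 module. It assumes an SQLite build with window function support (3.25 or later), a simplified three-column movie table, and invented rows in which two years tie for the highest count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema; all rows are made up.
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO actor VALUES (1, 'John Travolta');
    INSERT INTO movie VALUES
        (1, 'Film A', 1994), (2, 'Film B', 1998), (3, 'Film C', 1998),
        (4, 'Film D', 2001), (5, 'Film E', 2001);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 1), (5, 1, 1);
""")

# count(*) is computed per year first; max(...) OVER () then runs over
# those per-year rows and repeats the overall maximum on every row.
rows = conn.execute("""
    SELECT m.yr, count(*) AS movie_count
         , max(count(*)) OVER () AS max_ct
    FROM   casting c
    JOIN   movie   m ON c.movieid = m.id
    WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
    GROUP  BY m.yr
    ORDER  BY count(*) DESC, m.yr
""").fetchall()

for row in rows:
    print(row)
```

Every year is still returned, each carrying the same max_ct, so an outer query would be needed to keep only the rows where movie_count equals max_ct.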

To get one row with the highest count, you can use ORDER BY ct DESC LIMIT 1:

SELECT m.yr, count(*) AS ct
FROM   actor   a
JOIN   casting c ON c.actorid = a.id
JOIN   movie   m ON m.id = c.movieid  -- yr lives in movie, not casting
WHERE  a.name = 'John Travolta'
GROUP  BY m.yr
ORDER  BY ct DESC
LIMIT  1;

This uses only basic SQL features, available in any halfway decent RDBMS, although the LIMIT implementation varies.

Or you can get one row per group with the highest count with DISTINCT ON (only Postgres).

Actual Question

I need to get the rows for which count(*) is max.

There may be more than one row with the highest count.

SQL Server has had the feature WITH TIES for some time, with non-standard syntax:

SELECT TOP 1 WITH TIES
       m.yr, count(*) AS movie_count
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC;  -- can't sort by year for this

PostgreSQL 13 added WITH TIES with standard SQL syntax:

SELECT m.yr, count(*) AS movie_count
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC  -- can't sort by year for this
FETCH  FIRST 1 ROWS WITH TIES;

This should be the fastest possible query.

To sort results by additional criteria (or for older versions of Postgres or other RDBMS without WITH TIES), use the window function rank() in a subquery:

SELECT yr, movie_count
FROM  (
   SELECT m.yr, count(*) AS movie_count
        , rank() OVER (ORDER BY count(*) DESC) AS rnk
   FROM   casting c
   JOIN   movie   m ON c.movieid = m.id
   WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
   GROUP  BY m.yr
   ) sub
WHERE  rnk = 1
ORDER  BY yr;  -- optionally sort by year

All major RDBMS support window functions nowadays.
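
As a rough check of the rank() approach, the sketch below runs it through Python's built-in sqlite3 module. It assumes SQLite 3.25 or later for window functions, and it uses a simplified schema and made-up rows in which 1998 and 2001 tie for the highest count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema; all rows are made up.
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO actor VALUES (1, 'John Travolta');
    INSERT INTO movie VALUES
        (1, 'Film A', 1994), (2, 'Film B', 1998), (3, 'Film C', 1998),
        (4, 'Film D', 2001), (5, 'Film E', 2001);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 1), (5, 1, 1);
""")

# rank() gives every year with the top count the rank 1, so ties survive
# the filter in the outer query.
busiest = conn.execute("""
    SELECT yr, movie_count
    FROM  (
       SELECT m.yr, count(*) AS movie_count
            , rank() OVER (ORDER BY count(*) DESC) AS rnk
       FROM   casting c
       JOIN   movie   m ON c.movieid = m.id
       WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
       GROUP  BY m.yr
       ) sub
    WHERE  rnk = 1
    ORDER  BY yr
""").fetchall()

print(busiest)
```

Both tied years come back, which is the behaviour a plain LIMIT 1 cannot give you.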

SELECT * from 
(
SELECT yr as YEAR, COUNT(title) as TCOUNT
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr
order by TCOUNT desc
) res
where rownum < 2

It's from this site: http://sqlzoo.net/3.htm

Two possible solutions:

With TOP 1 and ORDER BY ... DESC:

SELECT yr, COUNT(title) 
FROM actor 
JOIN casting ON actor.id=actorid
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING count(title)=(SELECT TOP 1 COUNT(title) 
FROM casting 
JOIN movie ON movieid=movie.id 
JOIN actor ON actor.id=actorid
WHERE name='John Travolta'
GROUP BY yr
ORDER BY count(title) desc)

With MAX:

SELECT yr, COUNT(title) 
FROM actor  
JOIN casting ON actor.id=actorid    
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING 
    count(title)=
        (SELECT MAX(A.CNT) 
            FROM (SELECT COUNT(title) AS CNT FROM actor 
                JOIN casting ON actor.id=actorid
                JOIN movie ON movie.id=movieid
                    WHERE name = 'John Travolta'
                    GROUP BY (yr)) AS A)
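
The MAX variant can be tried end to end with Python's built-in sqlite3 module, as sketched below. SQLite has no TOP, so only this second form is shown; the schema is simplified and the rows are made up, with two years sharing the highest count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema; all rows are made up.
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO actor VALUES (1, 'John Travolta');
    INSERT INTO movie VALUES
        (1, 'Film A', 1994), (2, 'Film B', 1998), (3, 'Film C', 1998),
        (4, 'Film D', 2001), (5, 'Film E', 2001);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 1), (5, 1, 1);
""")

# The innermost query counts movies per year, the middle one takes the
# maximum of those counts, and HAVING keeps the years that reach it.
busiest = conn.execute("""
    SELECT yr, COUNT(title)
    FROM   actor
    JOIN   casting ON actor.id = casting.actorid
    JOIN   movie   ON movie.id = casting.movieid
    WHERE  name = 'John Travolta'
    GROUP  BY yr
    HAVING COUNT(title) =
           (SELECT MAX(a.cnt)
            FROM  (SELECT COUNT(title) AS cnt
                   FROM   actor
                   JOIN   casting ON actor.id = casting.actorid
                   JOIN   movie   ON movie.id = casting.movieid
                   WHERE  name = 'John Travolta'
                   GROUP  BY yr) AS a)
    ORDER  BY yr
""").fetchall()

print(busiest)
```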

Using max with a limit will only give you the first row, but if two or more rows share the same maximum number of movies, you are going to miss some data. Below is a way to do it if you have the rank() function available.

SELECT
    total_final.yr,
    total_final.num_movies
    FROM
    ( SELECT 
        total.yr, 
        total.num_movies, 
        RANK() OVER (ORDER BY num_movies desc) rnk
        FROM (
               SELECT 
                      m.yr, 
                      COUNT(*) AS num_movies
               FROM MOVIE m
               JOIN CASTING c ON c.movieid = m.id
               JOIN ACTOR a ON a.id = c.actorid
               WHERE a.name = 'John Travolta'
               GROUP BY m.yr
             ) AS total
    ) AS total_final 
   WHERE rnk = 1

The following code gives you the answer. It essentially implements MAX(COUNT(*)) by using ALL. It has the advantage that it uses very basic commands and operations.

SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
  (SELECT COUNT(title)
   FROM actor
   JOIN casting ON actor.id = casting.actorid
   JOIN movie ON casting.movieid = movie.id
   WHERE name = 'John Travolta'
   GROUP BY yr)

Thanks to the last answer

I had the same problem: I needed just the records whose count matches the maximum count (it could be one or several records).

I have to learn more about the ALL clause, and this is exactly the kind of simple solution that I was looking for.

Depending on which database you're using...

select yr, count(*) num from ...
order by num desc

Most of my experience is in Sybase, which uses some different syntax than other DBs. But in this case, you're naming your count column, so you can sort by it in descending order. You can go a step further and restrict your results to the first 10 rows (to find his 10 busiest years).

select top 1 yr, count(*)
from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr
order by 2 desc

create view sal as
select yr, count(*) as ct from
(select title, yr from movie m, actor a, casting c
where a.name='JOHN'
and a.id=c.actorid
and c.movieid=m.id)
group by yr

-----VIEW CREATED-----

select yr from sal
where ct =(select max(ct) from sal)

YR
2013

You can use TOP along with WITH TIES, which will include all of the years having the maximum count(*) value, something like this:

select top (1) with ties yr, count(*)
from movie
   join casting 
      on casting.movieid=movie.id
   join actor 
      on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr
order by count(*) desc;

If the maximum is, say, 6, you'll get all of the years for which the count value is 6.
