GROUP BY on one column with multiple columns in SELECT

Question

I have a relation of movies with the following attributes: movie_id, title, director, genres, year, revenue_generated, short_description, runtime, and rating.

I want to find the movies from each genre that have the highest revenue. I would like to return the movie_id, title, genre, and revenue of the movie from each genre that had the highest revenue.

I've tried the following query:

SELECT movie_id, title, genres, MAX(revenue_generated) AS highest_revenue
FROM movies
GROUP BY genres;

But get the following error:

ERROR:  column "movies.movie_id" must appear in the GROUP BY clause or be used in an aggregate function

I know this isn't working because it doesn't know how to "group" the movie_id and title by each category. If I were to do the following query:

SELECT genres, MAX(revenue_generated) AS highest_revenue
FROM movies
GROUP BY genres;

how would I also get the movie_id and title for these observations in the output?

Answer 1

I want to find the movies from each genre that have the highest revenue

In SQL Server and other platforms that support window functions (PostgreSQL included), you would use ROW_NUMBER() here, e.g.:

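WITH q AS
(
  SELECT movie_id,
         title,
         genres,
         revenue_generated,
         -- number the movies within each genre, highest revenue first
         ROW_NUMBER() OVER (PARTITION BY genres ORDER BY revenue_generated DESC) AS rn
  FROM movies
)
-- keep only the top-ranked movie of each genre
SELECT movie_id,
       title,
       genres,
       revenue_generated
FROM q
WHERE rn = 1;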
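
Note that ROW_NUMBER() returns exactly one row per genre, so if two movies tie for the highest revenue, one of them is picked arbitrarily. If you would rather see every tied movie, one alternative is to join the table back to the per-genre maximum from the second query in the question. This is only a sketch, assuming the movies table and the genres column name exactly as described in the question; the aliases m and g are just illustrative:

```sql
-- Assumes the movies table from the question (column named genres).
SELECT m.movie_id,
       m.title,
       m.genres,
       m.revenue_generated
FROM movies AS m
JOIN (
    -- highest revenue per genre (the second query from the question)
    SELECT genres, MAX(revenue_generated) AS highest_revenue
    FROM movies
    GROUP BY genres
) AS g
  ON  m.genres = g.genres
  AND m.revenue_generated = g.highest_revenue
ORDER BY m.genres;
```

Because the join uses plain equality, rows where genres is NULL will not match and are left out. Swapping ROW_NUMBER() for RANK() in the window-function version should give the same tie behavior if you prefer to stay with that approach.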
