How to apply group by here?

I have a table Movie with columns Movie and User (the viewer). Any user can view a movie any number of times, so the table can contain duplicate rows. I want to find the Top N most viewed movies and then the Top K viewers for each of those Top N movies. How can I apply group by or partition by effectively in such a scenario? If there is a better approach, please share. Thanks!

Movie User
Avengers John
Batman Chris
Batman Ron
X-Men Chris
X-Men Ron
Matrix John
Batman Martin
Matrix Chris
Batman Chris
X-Men Ron

So, in this table Batman is the most watched movie, followed by X-Men, and I want the result table to look like:

Movie User View count
Batman Chris 2
Batman Ron 1
Batman Martin 1
X-Men Ron 2
X-Men Chris 1
Matrix John 1
Matrix Chris 1
Avengers John 1

I understand that I can group by movie and then order by count(*) desc, but this doesn't give me the second column, grouped by viewer, or the count for each viewer.

Consider the approach below (assuming Top 3 movies with Top 2 users):

select movie, user, view_count
from (
  -- one row per (movie, user) pair, carrying total views per movie and views per pair
  select distinct *,
    count(*) over(partition by movie) movie_views,
    count(*) over(partition by movie, user) view_count
  from your_table
)
-- keep the Top 3 movies by total views and the Top 2 users within each movie
qualify dense_rank() over(order by movie_views desc) <= 3
and row_number() over(partition by movie order by view_count desc) <= 2
-- order by movie_views desc, view_count desc

If applied to the sample data in your question, the output is:

[image of the query output; not included in this copy]
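For engines without a qualify clause, the same idea can be sketched with plain window functions and a where filter. The snippet below is only an illustration, not part of the original answer: it runs the rewritten query against the question's sample rows in an in-memory SQLite database (window functions need SQLite 3.25 or newer). The function name top_viewers, the column name viewer, and the alphabetical tie-break between viewers with equal counts are assumptions added here so the result is deterministic; the original row_number() picks arbitrarily among ties.

```python
import sqlite3

# Sample rows copied from the question: one (movie, viewer) pair per view.
SAMPLE_ROWS = [
    ("Avengers", "John"), ("Batman", "Chris"), ("Batman", "Ron"),
    ("X-Men", "Chris"), ("X-Men", "Ron"), ("Matrix", "John"),
    ("Batman", "Martin"), ("Matrix", "Chris"), ("Batman", "Chris"),
    ("X-Men", "Ron"),
]

# Portable rewrite: each QUALIFY condition becomes a rank column
# that is filtered in an ordinary WHERE clause.
QUERY = """
WITH per_viewer AS (
  -- views per (movie, viewer) pair
  SELECT movie, viewer, COUNT(*) AS view_count
  FROM your_table
  GROUP BY movie, viewer
),
ranked AS (
  SELECT movie, viewer, view_count,
         -- total views per movie
         SUM(view_count) OVER (PARTITION BY movie) AS movie_views,
         -- viewer rank within a movie; the tie-break on viewer name is an assumption
         ROW_NUMBER() OVER (
           PARTITION BY movie ORDER BY view_count DESC, viewer
         ) AS viewer_rank
  FROM per_viewer
),
with_movie_rank AS (
  SELECT *, DENSE_RANK() OVER (ORDER BY movie_views DESC) AS movie_rank
  FROM ranked
)
SELECT movie, viewer, view_count
FROM with_movie_rank
WHERE movie_rank <= :top_n AND viewer_rank <= :top_k
ORDER BY movie_views DESC, movie, view_count DESC, viewer
"""


def top_viewers(rows, top_n, top_k):
    """Return (movie, viewer, view_count) for the top_k viewers of the top_n movies."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE your_table (movie TEXT, viewer TEXT)")
        conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)
        params = {"top_n": top_n, "top_k": top_k}
        return conn.execute(QUERY, params).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in top_viewers(SAMPLE_ROWS, top_n=3, top_k=2):
        print(row)
```

With the sample data, top_n=3 and top_k=2, this returns Batman (Chris 2, Martin 1), X-Men (Ron 2, Chris 1) and Matrix (Chris 1, John 1). Martin is listed ahead of Ron only because of the assumed alphabetical tie-break. Note also that dense_rank() gives tied movies the same rank, so more than N movies can come back when total view counts are equal.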
