max of each group in subquery that matches a condition

I have a table with 10 columns, and I am interested in 4 of them. Say it is tableA, with columns id, name, url and ranking, as shown below.

id    |name    |url     |ranking
--------------------------------
1     |apple   |a1.com  |1
2     |apple   |a1.com  |2
3     |apple   |a1z.com |3
4     |orange  |o1.com  |1
5     |orange  |o1.com  |2
6     |apple   |a1.com  |4
7     |apple   |a1z.com |5
8     |orange  |o1z.com |6

I want the rows with id 7, 6, 3, 2 and 8, 5, 4. That is, for each group (apple and orange), all rows with ranking > max(ranking) - 3, where the max is taken over the rows whose url has a z in it.

For apple, the max ranking among rows whose url has a z in it is 5 (id 7).

So I want the apple rows with ranking > 5 - 3, i.e. ranking greater than 2.

That is, the rows with id 7, 6, 3.

Similarly for the orange group (rows with id 8, 5, 4).

Hmmm. You seem to want at most four records from each group, ordered by ranking:

select t.*
from (select t.*,
             row_number() over (partition by name order by ranking desc) as seqnum
      from t
     ) t
where seqnum <= 4
order by name, ranking desc;

Oops, I just remembered: Amazon Redshift did not always support row_number() (current versions do). If it is not available to you, a cumulative count works:

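select t.*
from (select t.*,
             -- running count per name, highest ranking first;
             -- a rows frame (not range) keeps tied rankings on distinct numbers, like row_number()
             count(*) over (partition by name
                            order by ranking desc
                            rows between unbounded preceding and current row) as seqnum
      from t
     ) t
where seqnum <= 4
order by name, ranking desc;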
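
For comparison, here is a minimal sketch of the rule as the question words it, rather than the "top four per group" reading above: per name, take the highest ranking among rows whose url contains a z, then keep the rows ranked above that value minus 3. The table and column names come from the question; the aliases src and ranked and the column max_z_ranking are names I made up for illustration, and the column types are guesses.

```sql
-- Sample data from the question (column types are assumptions).
create table tableA (
    id      int,
    name    varchar(20),
    url     varchar(50),
    ranking int
);

insert into tableA (id, name, url, ranking) values
    (1, 'apple',  'a1.com',  1),
    (2, 'apple',  'a1.com',  2),
    (3, 'apple',  'a1z.com', 3),
    (4, 'orange', 'o1.com',  1),
    (5, 'orange', 'o1.com',  2),
    (6, 'apple',  'a1.com',  4),
    (7, 'apple',  'a1z.com', 5),
    (8, 'orange', 'o1z.com', 6);

-- Literal reading: the threshold comes only from rows whose url contains 'z'.
select ranked.*
from (select src.*,
             -- the case expression is null for non-z rows, and max() ignores nulls
             max(case when url like '%z%' then ranking end)
                 over (partition by name) as max_z_ranking
      from tableA src
     ) ranked
where ranking > max_z_ranking - 3
order by name, ranking desc;
```

On the sample data this returns ids 7, 6, 3 for apple (threshold 5 - 3 = 2) but only id 8 for orange (threshold 6 - 3 = 3), so it does not reproduce the 8, 5, 4 listed in the question. If those are the rows you want, the row-number version above is the closer match.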
