How to select up to x rows when records have to be grouped based on criteria

I want to write a query that selects up to x rows, where records are grouped by an id and only whole groups may appear in the result. The values I filter on are stored in the p_id column; rows with the same value form a group. Given this table:

    p_id    age
0   00170   64  
1   00170   64  
2   00201   24  
3   00201   64  
4   00201   64  
5   00300   24  
6   00300   20  

I want to select 4 rows, but because the groups with p_id 00170 and 00201 total 5 records, I get:

0   00170   64 
1   00170   64

If I select 5 rows, I get:

0   00170   64 
1   00170   64 
2   00201   24 
3   00201   64 
4   00201   64 

If I select 6 rows, I get the same result (the group with p_id 00300 has 2 records, so it is not included because the total would exceed 6):

0   00170   64 
1   00170   64 
2   00201   24 
3   00201   64 
4   00201   64 

So only whole groups are returned. I'm working with an Oracle database, where selecting x rows is easy with ROWNUM. I get lost when I try to select up to a certain number of rows with this additional criterion.
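The selection rule described above can be sketched outside the database. This is a minimal Python illustration, not Oracle code; the function name `take_whole_groups` is hypothetical, and it assumes that groups are taken in p_id order and that selection stops at the first group that does not fit.

```python
from itertools import groupby

# Sample rows (p_id, age) copied from the table in the question.
ROWS = [
    ("00170", 64), ("00170", 64),
    ("00201", 24), ("00201", 64), ("00201", 64),
    ("00300", 24), ("00300", 20),
]


def take_whole_groups(rows, limit):
    """Return leading p_id groups whose combined size stays within limit."""
    picked = []
    ordered = sorted(rows, key=lambda r: r[0])  # stable sort by p_id
    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        if len(picked) + len(group) > limit:
            break  # assumption: stop at the first group that would overflow
        picked.extend(group)
    return picked


for limit in (4, 5, 6):
    print(limit, len(take_whole_groups(ROWS, limit)))
```

With the sample data, limits of 4, 5 and 6 yield 2, 5 and 5 rows, matching the three result sets listed in the question.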

Oracle Setup:

CREATE TABLE test_data ( p_id, age ) AS
SELECT '00170', 64 FROM DUAL UNION ALL
SELECT '00170', 64 FROM DUAL UNION ALL
SELECT '00201', 24 FROM DUAL UNION ALL
SELECT '00201', 64 FROM DUAL UNION ALL
SELECT '00201', 64 FROM DUAL UNION ALL
SELECT '00300', 24 FROM DUAL UNION ALL  
SELECT '00300', 20 FROM DUAL;

Query:

Order the rows, find the maximum row number for each group, and then filter to return only the groups whose maximum row number falls within the row limit you want:

SELECT p_id,
       age
FROM   (
  SELECT t.*,
         MAX( ROWNUM ) OVER ( PARTITION BY p_id ) AS grp
  FROM   (
    SELECT *
    FROM   test_data
    ORDER BY p_id
  ) t
)
WHERE  grp <= 4;

Output:

P_ID  | AGE
:---- | --:
00170 |  64
00170 |  64

If you change the limit in the last line to 5, the query returns 5 rows; if you change it to 6, it still returns 5 rows.
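The same idea can be checked without an Oracle instance. The sketch below is an approximation, not the query above: it runs on SQLite through Python's `sqlite3` module (assuming a bundled SQLite of 3.25 or later, which has window functions), and because SQLite has no ROWNUM it substitutes `ROW_NUMBER() OVER (ORDER BY p_id)`. The helper name `rows_for` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_data (p_id TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO test_data VALUES (?, ?)",
    [("00170", 64), ("00170", 64), ("00201", 24), ("00201", 64),
     ("00201", 64), ("00300", 24), ("00300", 20)],
)

# ROW_NUMBER() stands in for Oracle's ROWNUM over the ordered subquery;
# MAX(...) OVER (PARTITION BY p_id) is the position of each group's last row.
QUERY = """
SELECT p_id, age
FROM (
  SELECT p_id, age, MAX(rn) OVER (PARTITION BY p_id) AS grp
  FROM (
    SELECT p_id, age, ROW_NUMBER() OVER (ORDER BY p_id) AS rn
    FROM test_data
  )
)
WHERE grp <= ?
ORDER BY p_id
"""


def rows_for(limit):
    """Return the whole groups that fit within the given row limit."""
    return conn.execute(QUERY, (limit,)).fetchall()


for limit in (4, 5, 6):
    print(limit, rows_for(limit))
```

The order of rows inside one p_id group is not defined here, but the per-group maximum is, so the set of returned groups is deterministic.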


I would address this with a window count and filtering:

select p_id, age
from (select p_id, age, count(*) over(order by p_id) cnt from mytable t) t
where cnt <= 5
order by p_id

You can change cnt <= 5 as needed.

Demo on DB Fiddle:

cnt <= 4:

P_ID | AGE
---: | --:
 170 |  64
 170 |  64

cnt <= 5:

P_ID | AGE
---: | --:
 170 |  64
 170 |  64
 201 |  24
 201 |  64
 201 |  64

cnt <= 6:

P_ID | AGE
---: | --:
 170 |  64
 170 |  64
 201 |  24
 201 |  64
 201 |  64
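The window count works because an ORDER BY in the OVER clause without an explicit frame defaults to a RANGE frame, which includes all peers of the current row, so every row of a p_id group receives the running total up to the end of that group. The sketch below shows the per-row values; it is an illustration on SQLite via Python's `sqlite3` (assuming SQLite 3.25 or later), not a run against Oracle, and it uses the `test_data` table name rather than `mytable`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_data (p_id TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO test_data VALUES (?, ?)",
    [("00170", 64), ("00170", 64), ("00201", 24), ("00201", 64),
     ("00201", 64), ("00300", 24), ("00300", 20)],
)

# Default frame (RANGE ... CURRENT ROW): rows with the same p_id are peers,
# so each row gets the cumulative count through the end of its group.
cumulative = [
    row[0]
    for row in conn.execute(
        "SELECT COUNT(*) OVER (ORDER BY p_id) AS cnt "
        "FROM test_data ORDER BY p_id"
    )
]

# Explicit ROWS frame for contrast: a plain running count, one step per row.
running = sorted(
    row[0]
    for row in conn.execute(
        "SELECT COUNT(*) OVER (ORDER BY p_id "
        "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cnt "
        "FROM test_data"
    )
)

print(cumulative)
print(running)
```

Filtering on the first list keeps or drops a group as a unit, while filtering on the second could cut through the middle of a group.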

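GMB's answer is fine. A variant uses RANK(). Note that RANK() reports the position of the first row of each p_id group, so add the group size minus one to get the position of the group's last row and filter on that:

select p_id, age
from (select t.*,
             -- rank() is the position of the group's first row;
             -- adding the group size minus 1 gives the position of its last row
             rank() over (order by p_id)
               + count(*) over (partition by p_id) - 1 as rnk
      from t
     ) t
where rnk <= 5
order by p_id;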

More importantly, though, if the p_id values are not ordered, then you might want an additional step: assign the minimum value of some ordering column to each p_id. Let me call that ordering column id:

select p_id, age
from (select t.*,
             -- position of the last row of each group, ordered by p_id_grp
             rank() over (order by p_id_grp)
               + count(*) over (partition by p_id) - 1 as rnk
      from (select t.*, min(id) over (partition by p_id) as p_id_grp
            from t
           ) t
     ) t
where rnk <= 5
order by p_id;
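When filtering on a rank, it helps to see which position RANK() reports for each group. The sketch below compares the rank of each p_id group (the position of its first row) with the position of its last row. It is an illustration on SQLite via Python's `sqlite3` (assuming SQLite 3.25 or later), and the column aliases `first_pos` and `last_pos` are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_data (p_id TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO test_data VALUES (?, ?)",
    [("00170", 64), ("00170", 64), ("00201", 24), ("00201", 64),
     ("00201", 64), ("00300", 24), ("00300", 20)],
)

# RANK() gives tied rows the position of the first row in the tie.
# Adding the group size minus 1 gives the position of the last row.
positions = conn.execute("""
    SELECT DISTINCT
           p_id,
           RANK() OVER (ORDER BY p_id) AS first_pos,
           RANK() OVER (ORDER BY p_id)
             + COUNT(*) OVER (PARTITION BY p_id) - 1 AS last_pos
    FROM test_data
    ORDER BY p_id
""").fetchall()

for p_id, first_pos, last_pos in positions:
    print(p_id, first_pos, last_pos)
```

A filter on `first_pos` admits a group as soon as its first row fits within the limit, whereas a filter on `last_pos` admits it only when the whole group fits.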

This is a typical Top-N query:

Use ROWNUM with an ordered inline view so that the rows are sorted before the limit is applied:

SELECT p_id, age
FROM   (SELECT p_id, age
        FROM   test_data
        ORDER BY age DESC)
WHERE ROWNUM <= 4;

From Oracle 12c onward there is also the FETCH clause:

SELECT p_id, age
FROM   test_data
ORDER BY age DESC
FETCH FIRST 4 ROWS ONLY;

More resources: https://oracle-base.com/articles/misc/top-n-queries
