
How to select rows with non-unique values in a specific column when grouped by other columns?

I have a table tbl like this:

| id | grp | pid | oid |
| -- | --- | --- | --- |
| 1  | 1   | 1   | 1   |
| 2  | 2   | 2   | 1   |
| 3  | 3   | 1   | 1   |
| 4  | 3   | 2   | 1   |
| 5  | 4   | 1   | 1   |
| 6  | 1   | 1   | 2   |
| 7  | 2   | 2   | 2   |
| 8  | 3   | 1   | 2   |
| 9  | 4   | 1   | 2   |

I am trying to write a PostgreSQL query that selects the rows where, for a given grp within a given oid, pid has more than one distinct value. In the table above, pid has two distinct values (1 and 2) for grp 3 in oid 1, so the query should return:

| id | grp | pid | oid |
| -- | --- | --- | --- |
| 3  | 3   | 1   | 1   |
| 4  | 3   | 2   | 1   |

I have a solution for this using Python + Pandas, though it is less than ideal:

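import pandas as pd

# db.engine is the application's SQLAlchemy engine, defined elsewhere
rows = pd.read_sql("SELECT * FROM tbl", db.engine)
matches = []
for oid in rows['oid'].unique():          # visit each oid once
    oid_rows = rows[rows['oid'] == oid]
    for grp in oid_rows['grp'].unique():  # visit each grp once per oid
        grp_rows = oid_rows[oid_rows['grp'] == grp]
        # keep the group only if it contains more than one distinct pid
        if grp_rows['pid'].nunique() > 1:
            matches.append(grp_rows)
# concatenate once at the end; fall back to an empty frame with the same columns
output = pd.concat(matches) if matches else rows.iloc[0:0]
print(output)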

I'd prefer to do this purely in SQL, with a query along the lines of:

SELECT * FROM tbl HAVING COUNT(pid) > 1 IN
    (SELECT * FROM tbl GROUP BY grp, oid)

How do I write this query?

You can use exists:

select t.*
from tbl t
where exists (select 1
              from tbl t2
              where t2.grp = t.grp and t2.oid = t.oid and
                    t2.pid <> t.pid  -- another row in the same (grp, oid) with a different pid
             );
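
One way to sanity-check the exists approach against the sample table is to run it from Python on an in-memory database. This is only a sketch under a few assumptions: SQLite (via the standard-library sqlite3 module) stands in for PostgreSQL, the function name run_exists_query is made up for illustration, and the subquery compares pid rather than id so that a group qualifies only when it really contains differing pid values.

```python
import sqlite3

# Sample rows from the question: (id, grp, pid, oid)
ROWS = [
    (1, 1, 1, 1), (2, 2, 2, 1), (3, 3, 1, 1), (4, 3, 2, 1), (5, 4, 1, 1),
    (6, 1, 1, 2), (7, 2, 2, 2), (8, 3, 1, 2), (9, 4, 1, 2),
]

# Assumption: pid is compared instead of id, so duplicate pid values
# inside one (grp, oid) group do not make the group qualify.
EXISTS_QUERY = """
    SELECT t.*
    FROM tbl t
    WHERE EXISTS (SELECT 1
                  FROM tbl t2
                  WHERE t2.grp = t.grp AND t2.oid = t.oid
                    AND t2.pid <> t.pid)
    ORDER BY t.id
"""


def run_exists_query(rows=ROWS):
    """Load rows into an in-memory SQLite table and run the EXISTS query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tbl (id INTEGER PRIMARY KEY, grp INTEGER, "
            "pid INTEGER, oid INTEGER)"
        )
        conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
        return conn.execute(EXISTS_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_exists_query():
        print(row)
```

On the sample data this yields the two rows with id 3 and id 4, matching the expected result in the question.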

You can also use window functions, although this may be less efficient:

select t.*
from (select t.*, count(*) over (partition by grp, oid) as cnt
      from tbl t
     ) t
where cnt >= 2;
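
Note that count(*) over a partition counts rows, not distinct pid values, and PostgreSQL does not accept DISTINCT inside a window function call. If the same pid can appear more than once within a (grp, oid) pair, one alternative is to find the qualifying groups with GROUP BY ... HAVING COUNT(DISTINCT pid) > 1 and join back to the table. The sketch below illustrates that idea; it again uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, and both the function name select_multi_pid_rows and the extra row with id 10 are hypothetical additions for the demonstration.

```python
import sqlite3

# Sample rows from the question: (id, grp, pid, oid), plus one hypothetical
# extra row (id 10) that repeats pid 1 within grp 4 / oid 2.
ROWS_WITH_DUPLICATE = [
    (1, 1, 1, 1), (2, 2, 2, 1), (3, 3, 1, 1), (4, 3, 2, 1), (5, 4, 1, 1),
    (6, 1, 1, 2), (7, 2, 2, 2), (8, 3, 1, 2), (9, 4, 1, 2), (10, 4, 1, 2),
]

# The derived table g lists the (grp, oid) pairs with more than one
# distinct pid; joining back returns every row of those groups.
DISTINCT_QUERY = """
    SELECT t.*
    FROM tbl t
    JOIN (SELECT grp, oid
          FROM tbl
          GROUP BY grp, oid
          HAVING COUNT(DISTINCT pid) > 1) g
      ON g.grp = t.grp AND g.oid = t.oid
    ORDER BY t.id
"""


def select_multi_pid_rows(rows=ROWS_WITH_DUPLICATE):
    """Return the rows whose (grp, oid) group has more than one distinct pid."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tbl (id INTEGER PRIMARY KEY, grp INTEGER, "
            "pid INTEGER, oid INTEGER)"
        )
        conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
        return conn.execute(DISTINCT_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in select_multi_pid_rows():
        print(row)
```

With the extra row present, a plain row count over (grp, oid) would also pick up ids 9 and 10, whereas the distinct count still returns only ids 3 and 4.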
