
PSQL select all rows with a non-unique column

The query should read from the items table and:

  • filter out items with active=0
  • select id and groupId only where at least one other active item shares that groupId

Example:

| id  | groupId | active |
| --- | ------- | ------ |
| 1   | 1       | 1      |
| 2   | 2       | 1      |
| 3   | 2       | 0      |
| 4   | 3       | 1      |
| 5   | 3       | 1      |
| 6   | 4       | 1      |

Desired Output:

| id  | groupId |
| --- | ------- |
| 4   | 3       |
| 5   | 3       |

Explanation

  • groupId 1: invalid because it has only 1 member
  • groupId 2: invalid because it has two members, but one is inactive
  • groupId 3: valid
  • groupId 4: invalid because it has only 1 member

What I tried

SELECT id, groupId
FROM items
WHERE id IN (
    SELECT id 
    FROM items
    WHERE active=1
    GROUP BY groupId
    HAVING COUNT(*) > 1
  );

But I get the error "id must appear in the GROUP BY clause or be used in an aggregate function". I understand I could change the sql_mode to get rid of that error, but I would rather avoid that.

Use a window function:

-- count the active rows in each group, then keep the groups with more than one
select i.id, i.groupid
from (select i.*, count(*) over (partition by groupid) as cnt
      from items i
      where active = 1
     ) i
where cnt > 1;
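To check the window-function approach against the example table, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for PostgreSQL (SQLite 3.25 or later is needed for window functions); the table layout is copied from the question, and the alias `counted` is just an illustrative name.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, groupId INTEGER, active INTEGER)"
)
# Sample rows taken from the question's example table.
conn.executemany(
    "INSERT INTO items (id, groupId, active) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 2, 1), (3, 2, 0), (4, 3, 1), (5, 3, 1), (6, 4, 1)],
)

# Count active rows per groupId with a window function,
# then keep only rows whose group has more than one active member.
window_rows = conn.execute(
    """
    SELECT id, groupId
    FROM (
        SELECT i.*, COUNT(*) OVER (PARTITION BY groupId) AS cnt
        FROM items i
        WHERE active = 1
    ) AS counted
    WHERE cnt > 1
    ORDER BY id
    """
).fetchall()

print(window_rows)  # [(4, 3), (5, 3)]
```

Because the `WHERE active = 1` filter runs before the window count, the inactive row in group 2 is not counted, so group 2 drops out just like the single-member groups.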

Window functions are the way to go.

But if you want to fix your query, this should do it:

select a.id, a.groupId
from items a
where a.active = 1
  and a.groupId in (
    -- groups that have more than one active item
    select groupId
    from items
    where active = 1
    group by groupId
    having count(distinct id) > 1
  );

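This works because the subquery groups only by groupId and returns only groupId, so nothing is selected that is not grouped or aggregated. It finds the groups with more than one active id, and the outer query then returns the active items that belong to those groups.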
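The role of the outer `active = 1` filter is easier to see with one extra row. The sketch below again assumes an in-memory SQLite database standing in for PostgreSQL, and it adds a hypothetical inactive item (id 7, groupId 3) that is not in the question's data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, groupId INTEGER, active INTEGER)"
)
conn.executemany(
    "INSERT INTO items (id, groupId, active) VALUES (?, ?, ?)",
    [
        (1, 1, 1), (2, 2, 1), (3, 2, 0), (4, 3, 1), (5, 3, 1), (6, 4, 1),
        (7, 3, 0),  # hypothetical extra row: inactive member of a valid group
    ],
)

# Groups with more than one active item; only groupId is selected,
# so the GROUP BY rule is satisfied.
valid_groups = """
    SELECT groupId
    FROM items
    WHERE active = 1
    GROUP BY groupId
    HAVING COUNT(*) > 1
"""

# With the outer filter: only active items of valid groups.
with_filter = conn.execute(
    f"SELECT id, groupId FROM items "
    f"WHERE active = 1 AND groupId IN ({valid_groups}) ORDER BY id"
).fetchall()

# Without the outer filter: the inactive item 7 leaks into the result.
without_filter = conn.execute(
    f"SELECT id, groupId FROM items "
    f"WHERE groupId IN ({valid_groups}) ORDER BY id"
).fetchall()

print(with_filter)     # [(4, 3), (5, 3)]
print(without_filter)  # [(4, 3), (5, 3), (7, 3)]
```

So the filter is needed in both places: inside the subquery to decide which groups qualify, and outside it to decide which rows of those groups are returned.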
