
SQL: Retrieve users who are in exactly two groups

My table looks like this:

user_id  group   probability
123      group1  0.9
123      group2  0.6
45       group2  0.8
567      group2  0.56
567      group3  0.78
567      group1  0.90

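I need to extract the users who are in both group1 and group2 and in no other group, which means I should retrieve only user 123. I wrote the following query: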

with two_groups as (
select user_id
from table1
where group in ('group1', 'group2')
group by 1
having max(group) <> min(group) and count(user_id) = 2
)
select *
from two_groups
join table1 using (user_id)

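I join it back to table1 because I can't add the group and probability columns as fields in the "two_groups" subquery, since I don't want to group by them.

The problem with my query is that it still retrieves user ID 567. I don't want that user returned, because it is also in group3. What can I do to extract only the users who are in exactly two groups?

Thanks!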

You should try using a proper inner join:

select *
from 
   two_groups t1 inner join 
   table1 t2 on t1.user_id = t2.user_id

Maybe:

with two_groups as (
  select user_id
  from table1
  group by 1
  having min(group) = 'group1' and
         max(group) = 'group2' and
         count(distinct group) = 2
)
select *
from two_groups
join table1 using (user_id)

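However, this approach does not extend to "members of exactly three groups", because min and max can pin down only two distinct values.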
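One possible way to generalize to any number of groups is to compare two distinct counts: the number of distinct groups a user has, and the number of those that fall in the wanted list. The sketch below is not part of the answer above. It runs the idea against the question's sample data using Python's built-in sqlite3 module, and the column name group_name and the helper users_in_exactly are assumptions made for illustration (GROUP is a reserved word, so the column is renamed here).

```python
import sqlite3

# Sample data from the question. The column is called group_name here
# (an assumption) because GROUP is a reserved word in SQL.
ROWS = [
    ("123", "group1", 0.9),
    ("123", "group2", 0.6),
    ("45", "group2", 0.8),
    ("567", "group2", 0.56),
    ("567", "group3", 0.78),
    ("567", "group1", 0.90),
]


def users_in_exactly(wanted):
    """Return ids of users whose set of groups equals `wanted` (hypothetical helper)."""
    wanted = sorted(set(wanted))
    placeholders = ", ".join("?" for _ in wanted)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (user_id TEXT, group_name TEXT, probability REAL)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", ROWS)
    query = f"""
        SELECT user_id
        FROM table1
        GROUP BY user_id
        -- the user has exactly N distinct groups overall ...
        HAVING COUNT(DISTINCT group_name) = ?
        -- ... and all N of them come from the wanted list
           AND COUNT(DISTINCT CASE WHEN group_name IN ({placeholders})
                                   THEN group_name END) = ?
        ORDER BY user_id
    """
    params = [len(wanted), *wanted, len(wanted)]
    result = [row[0] for row in conn.execute(query, params)]
    conn.close()
    return result


if __name__ == "__main__":
    print(users_in_exactly(["group1", "group2"]))
    print(users_in_exactly(["group1", "group2", "group3"]))
```

The same HAVING clause should carry over to other engines with their own quoting for the group column; only the list of wanted groups and the count N change.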

Also:

select *
from   table1 t1
where  exists (select 1 from table1 t2 where t2.user_id = t1.user_id and t2.group = 'group1')
and    exists (select 1 from table1 t2 where t2.user_id = t1.user_id and t2.group = 'group2')
and    not exists (select 1 from table1 t2 where t2.user_id = t1.user_id and t2.group not in ('group1', 'group2'))

You could join what you have with select count(*), user_id from table group by user_id having count(*) = 2. That gives you the users who are in exactly 2 groups.
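A minimal sketch of that suggestion, run against the question's sample data with Python's built-in sqlite3 module. It assumes one row per user and group (otherwise count(*) counts rows, not groups); the column name group_name, the CTE name exactly_two, and the function name are made up for illustration.

```python
import sqlite3

# Sample data from the question; group_name stands in for the reserved word GROUP.
ROWS = [
    ("123", "group1", 0.9),
    ("123", "group2", 0.6),
    ("45", "group2", 0.8),
    ("567", "group2", 0.56),
    ("567", "group3", 0.78),
    ("567", "group1", 0.90),
]

QUERY = """
    WITH two_groups AS (        -- the asker's original CTE
        SELECT user_id
        FROM table1
        WHERE group_name IN ('group1', 'group2')
        GROUP BY user_id
        HAVING MAX(group_name) <> MIN(group_name) AND COUNT(user_id) = 2
    ),
    exactly_two AS (            -- users with exactly two rows in total
        SELECT user_id, COUNT(*) AS n
        FROM table1
        GROUP BY user_id
        HAVING COUNT(*) = 2
    )
    SELECT t.user_id, t.group_name, t.probability
    FROM two_groups AS g
    JOIN exactly_two AS e ON e.user_id = g.user_id
    JOIN table1 AS t ON t.user_id = g.user_id
    ORDER BY t.group_name
"""


def rows_for_users_in_both_groups_only():
    """Run the combined query on an in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (user_id TEXT, group_name TEXT, probability REAL)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_for_users_in_both_groups_only():
        print(row)
```

User 567 passes the first CTE but has three rows in total, so the join with exactly_two drops it.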

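Would this work? The first query is your original CTE and should get all users who have a row in both groups. The second query removes every user who also has a row in a group other than group1 or group2.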

select user_id
from table1
where group in ('group1', 'group2')
group by user_id
having max(group) <> min(group)
    and count(user_id) = 2

EXCEPT DISTINCT

select user_id
from table1
where group not in ('group1', 'group2')

Consider the following (BigQuery):

select *
from your_table
where true 
qualify 2 = countif(`group` in ('group1', 'group2')) over(partition by user_id) 
and 0 = countif(not `group` in ('group1', 'group2')) over(partition by user_id)          

If applied to the sample data in your question, the output is the two rows for user 123.
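QUALIFY is not available in every engine. Where it is missing, the same filter can usually be written by computing the window counts in a subquery and filtering in the outer query. The sketch below is an assumption-laden illustration, not part of the answer above: it uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), replaces countif with SUM over a CASE expression, and renames the group column to group_name.

```python
import sqlite3

# Sample data from the question; group_name stands in for the reserved word GROUP.
ROWS = [
    ("123", "group1", 0.9),
    ("123", "group2", 0.6),
    ("45", "group2", 0.8),
    ("567", "group2", 0.56),
    ("567", "group3", 0.78),
    ("567", "group1", 0.90),
]

QUERY = """
    SELECT user_id, group_name, probability
    FROM (
        SELECT t.*,
               -- rows of this user that are in the wanted groups
               SUM(CASE WHEN group_name IN ('group1', 'group2') THEN 1 ELSE 0 END)
                   OVER (PARTITION BY user_id) AS wanted_rows,
               -- rows of this user that are in any other group
               SUM(CASE WHEN group_name NOT IN ('group1', 'group2') THEN 1 ELSE 0 END)
                   OVER (PARTITION BY user_id) AS other_rows
        FROM table1 AS t
    ) AS counted
    WHERE wanted_rows = 2 AND other_rows = 0
    ORDER BY group_name
"""


def rows_without_qualify():
    """Run the subquery-based filter on an in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (user_id TEXT, group_name TEXT, probability REAL)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_without_qualify():
        print(row)
```

Like the QUALIFY version, this counts rows rather than distinct groups, so it assumes each user has at most one row per group.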

You can also consider the following:

WITH TEMP_USER AS
(
    SELECT "123" AS USER_ID,"group1" AS GROUP_NAME, 0.9 AS PROBABILITY UNION ALL
    SELECT "123" AS USER_ID,"group2" AS GROUP_NAME, 0.6 AS PROBABILITY UNION ALL
    SELECT "45" AS USER_ID,"group2" AS GROUP_NAME, 0.8 AS PROBABILITY UNION ALL
    SELECT "567" AS USER_ID,"group2" AS GROUP_NAME, 0.56 AS PROBABILITY UNION ALL
    SELECT "567" AS USER_ID,"group3" AS GROUP_NAME, 0.78 AS PROBABILITY UNION ALL
    SELECT "567" AS USER_ID,"group1" AS GROUP_NAME, 0.90 AS PROBABILITY 

)

SELECT * FROM TEMP_USER U
WHERE EXISTS 
(
SELECT * FROM
    (
        SELECT USER_ID,STRING_AGG(DISTINCT GROUP_NAME ORDER BY GROUP_NAME) AS NAME 
        FROM TEMP_USER 
        GROUP BY USER_ID
        HAVING NAME="group1,group2"
    ) G 
WHERE U.USER_ID=G.USER_ID
);
