SQL help to find unique pairs by group and count

I need some help with SQL, possibly using GROUP BY and COUNT, or whatever else it takes. I just could not find a way. Thanks a lot.

A simple table:

ColA   ColB
  1      A
  1      A
  2      B
  3      B
  4      C
  4      C
  5      C

Return all unique pairs of ColA and ColB where the same ColB has more than one distinct ColA value.

For the data above, it should return:

ColA  ColB
  2     B
  3     B
  4     C
  5     C

EDIT: Apologies for the first answer; I hadn't noticed your requirement that the same ColB must have more than one distinct ColA value. Here's my updated answer:

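SELECT ColA, ColB
FROM test
WHERE EXISTS (
    -- Keep a row only if its ColB group holds more than one distinct ColA.
    SELECT subtest.ColB
    FROM test AS subtest
    WHERE subtest.ColB = test.ColB
    GROUP BY subtest.ColB
    HAVING COUNT(DISTINCT subtest.ColA) > 1
)
GROUP BY ColA, ColB  -- collapse duplicate (ColA, ColB) rows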
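To check the EXISTS query against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database from Python. SQLite, the helper name `pairs_with_exists`, the `SAMPLE_ROWS` list, and the added ORDER BY are assumptions made for illustration; the answer itself does not name a database engine.

```python
import sqlite3

# Sample rows from the question, as (ColA, ColB).
SAMPLE_ROWS = [(1, "A"), (1, "A"), (2, "B"), (3, "B"), (4, "C"), (4, "C"), (5, "C")]

# Same logic as the answer's query; ORDER BY is added only to make the
# output deterministic for comparison.
EXISTS_QUERY = """
SELECT ColA, ColB
FROM test
WHERE EXISTS (
    SELECT subtest.ColB
    FROM test AS subtest
    WHERE subtest.ColB = test.ColB
    GROUP BY subtest.ColB
    HAVING COUNT(DISTINCT subtest.ColA) > 1
)
GROUP BY ColA, ColB
ORDER BY ColA, ColB
"""


def pairs_with_exists(rows):
    """Load rows into an in-memory table named test and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE test (ColA INTEGER, ColB TEXT)")
        conn.executemany("INSERT INTO test VALUES (?, ?)", rows)
        return conn.execute(EXISTS_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for pair in pairs_with_exists(SAMPLE_ROWS):
        print(pair)
```

Running the script prints the four pairs the question asks for; the (1, A) rows drop out because ColB = A has only one distinct ColA.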

You can use APPLY to check whether ColB has more than one distinct ColA value:

WITH Cte(ColA, ColB) AS(
    SELECT * FROM( VALUES
        (1, 'A'), (1, 'A'), (2, 'B'), (3, 'B'), (4, 'C'), (4, 'C'), (5, 'C')
    ) t(a,b)
)
SELECT DISTINCT c1.*
FROM Cte c1
CROSS APPLY(
    SELECT COUNT(*) AS cnt
    FROM Cte c2
    WHERE
        c2.ColB = c1.ColB
        AND c2.ColA <> c1.ColA
    GROUP BY c2.ColB
    HAVING COUNT(*) > 0
) x
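APPLY is not available in every database engine. The same test, "is there another row with the same ColB but a different ColA?", can also be written as a correlated EXISTS. The sketch below is a swapped-in equivalent rather than the answer's own query; it assumes SQLite driven from Python, and the table name `t` and the helper `pairs_with_other_cola` are hypothetical.

```python
import sqlite3

# Sample rows from the question, as (ColA, ColB).
SAMPLE_ROWS = [(1, "A"), (1, "A"), (2, "B"), (3, "B"), (4, "C"), (4, "C"), (5, "C")]

# A row qualifies when some other row shares its ColB but has a different ColA.
# DISTINCT removes the duplicate (4, C) row from the result.
NEIGHBOUR_QUERY = """
SELECT DISTINCT c1.ColA, c1.ColB
FROM t AS c1
WHERE EXISTS (
    SELECT 1
    FROM t AS c2
    WHERE c2.ColB = c1.ColB
      AND c2.ColA <> c1.ColA
)
ORDER BY c1.ColA, c1.ColB
"""


def pairs_with_other_cola(rows):
    """Return the unique (ColA, ColB) pairs whose ColB has another ColA."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (ColA INTEGER, ColB TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        return conn.execute(NEIGHBOUR_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for pair in pairs_with_other_cola(SAMPLE_ROWS):
        print(pair)
```

Because `<>` never evaluates to true against NULL, rows with a NULL ColA would not count as "different" here; the sample data has no NULLs, so that case is not exercised.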

First, using a Group By clause together with Row_Number(), I get a result set in which the distinct ColumnA values are numbered within each ColumnB. This way I can identify the unique pairs of ColumnA and ColumnB where the same ColumnB has more than one distinct value of ColumnA.

Select
    ColumnA,
    ColumnB,
    Row_Number() Over(Partition By ColumnB Order By ColumnA) As RowNum
From SimpleTable
Group By ColumnB, ColumnA;

Output:

ColumnA ColumnB RowNum
1       A       1
2       B       1
3       B       2
4       C       1
5       C       2

Now you can put this result in a table expression, either a CTE or a derived table (I'm choosing a CTE), and keep only those ColumnB values that have a row with RowNum greater than or equal to 2. So the final query will be:

;With CTE
As
(
    Select
        ColumnA,
        ColumnB,
        Row_Number() Over(Partition By ColumnB Order By ColumnA) As RowNum
    From SimpleTable
    Group By ColumnB, ColumnA
)
Select ColumnA, ColumnB From CTE
Where ColumnB In (Select ColumnB From CTE Where RowNum >=2)
Order By ColumnA, ColumnB;

Final output:

ColumnA ColumnB
2       B
3       B
4       C
5       C
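Row_Number() can sit next to Group By because window functions are evaluated after grouping, so each (ColumnA, ColumnB) pair is numbered exactly once. The sketch below reproduces both result sets shown in this answer. It assumes SQLite 3.25 or later (for window function support) driven from Python; the helper name `run_query` and the ORDER BY on the first query are additions for illustration.

```python
import sqlite3

# Sample rows from the question, as (ColumnA, ColumnB).
SAMPLE_ROWS = [(1, "A"), (1, "A"), (2, "B"), (3, "B"), (4, "C"), (4, "C"), (5, "C")]

# Step 1: group away duplicates, then number the rows inside each ColumnB.
NUMBERED_QUERY = """
SELECT ColumnA, ColumnB,
       ROW_NUMBER() OVER (PARTITION BY ColumnB ORDER BY ColumnA) AS RowNum
FROM SimpleTable
GROUP BY ColumnB, ColumnA
ORDER BY ColumnA, ColumnB
"""

# Step 2: keep every pair whose ColumnB reached RowNum 2 at least once.
FINAL_QUERY = """
WITH CTE AS (
    SELECT ColumnA, ColumnB,
           ROW_NUMBER() OVER (PARTITION BY ColumnB ORDER BY ColumnA) AS RowNum
    FROM SimpleTable
    GROUP BY ColumnB, ColumnA
)
SELECT ColumnA, ColumnB
FROM CTE
WHERE ColumnB IN (SELECT ColumnB FROM CTE WHERE RowNum >= 2)
ORDER BY ColumnA, ColumnB
"""


def run_query(query, rows):
    """Load rows into an in-memory SimpleTable and return the query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE SimpleTable (ColumnA INTEGER, ColumnB TEXT)")
        conn.executemany("INSERT INTO SimpleTable VALUES (?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_query(NUMBERED_QUERY, SAMPLE_ROWS))
    print(run_query(FINAL_QUERY, SAMPLE_ROWS))
```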

Hope this is helpful :)

I would suggest a simple approach using min() and max() as window functions:

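select distinct colA, colB   -- distinct: return each pair only once
from (select t.*,
             min(colA) over (partition by colB) as mincolA,
             max(colA) over (partition by colB) as maxcolA
      from t
     ) t
where mincolA <> maxcolA;    -- min <> max means at least two distinct colA values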
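The idea behind the comparison is that the smallest and largest colA within a colB partition differ exactly when that partition holds at least two distinct (non-NULL) colA values. Here is a minimal sketch of that check on the question's sample data, assuming SQLite 3.25 or later driven from Python. The helper name `pairs_by_min_max` and the derived-table alias `bounds` are hypothetical, and the sketch uses SELECT DISTINCT so that each pair appears once.

```python
import sqlite3

# Sample rows from the question, as (colA, colB).
SAMPLE_ROWS = [(1, "A"), (1, "A"), (2, "B"), (3, "B"), (4, "C"), (4, "C"), (5, "C")]

# Attach the per-colB minimum and maximum of colA to every row, then keep
# rows from partitions where the two bounds differ.
MINMAX_QUERY = """
SELECT DISTINCT colA, colB
FROM (
    SELECT colA, colB,
           MIN(colA) OVER (PARTITION BY colB) AS mincolA,
           MAX(colA) OVER (PARTITION BY colB) AS maxcolA
    FROM t
) AS bounds
WHERE mincolA <> maxcolA
ORDER BY colA, colB
"""


def pairs_by_min_max(rows):
    """Return unique (colA, colB) pairs whose colB spans several colA values."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (colA INTEGER, colB TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        return conn.execute(MINMAX_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for pair in pairs_by_min_max(SAMPLE_ROWS):
        print(pair)
```

Compared with COUNT(DISTINCT ...) this avoids a distinct aggregate, which some engines do not allow as a window function.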
