
Select N most frequent values of column B for each value of column A

Given a MySQL table like the following:

id | colA | colB
...| 1    | 13
...| 1    | 13
...| 1    | 12
...| 1    | 12
...| 1    | 11
...| 2    | 78
...| 2    | 78
...| 2    | 78
...| 2    | 13
...| 2    | 13
...| 2    | 9

For each value of colA, I want to find the N most frequent values of colB.

Example result for N = 2:

colA | colB
1    | 13
1    | 12
2    | 78
2    | 13

I was able to get all unique combinations of colA and colB, along with their frequencies, using:

SELECT colA, colB, COUNT(*) AS freq FROM t GROUP BY colA, colB ORDER BY freq DESC;

Example result:

colA | colB | freq
1    | 13   | 2
1    | 12   | 2
1    | 11   | 1
2    | 78   | 3
2    | 13   | 2
2    | 9    | 1

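But I am struggling to apply the LIMIT to each value of colA rather than to the whole table.

This is basically like "How to select most frequent value in a column per each id group?", only for MySQL instead of PostgreSQL.

I am currently using MariaDB 10.1.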

If you can (window functions require MariaDB 10.2+ or MySQL 8.0+), use window functions:

SELECT colA, colB, freq
FROM (SELECT colA, colB, COUNT(*) AS freq,
             DENSE_RANK() OVER (PARTITION BY colA ORDER BY COUNT(*) DESC) as seqnum
      FROM t
      GROUP BY colA, colB 
     ) ab
WHERE seqnum <= 2;

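Note that you might want DENSE_RANK(), RANK(), or ROW_NUMBER(), depending on how you want to treat ties. If five colB values share the two highest ranks, DENSE_RANK() will return all five.

If you want exactly two values per colA, use ROW_NUMBER().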
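To make the tie behaviour concrete, here is a minimal sketch that runs the three ranking functions against the sample rows from the question. It uses Python's built-in sqlite3 module rather than MariaDB, so it assumes an SQLite build of 3.25 or later (the first with window functions). The helper names make_db and top_n are hypothetical, the counts are computed in an inner subquery purely for portability, and the colB DESC tie-breaker for ROW_NUMBER() is an assumption chosen to make the result deterministic.

```python
import sqlite3

# Sample rows (colA, colB) copied from the table in the question.
ROWS = [(1, 13), (1, 13), (1, 12), (1, 12), (1, 11),
        (2, 78), (2, 78), (2, 78), (2, 13), (2, 13), (2, 9)]


def make_db():
    """Create an in-memory table t holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, colA INTEGER, colB INTEGER)")
    conn.executemany("INSERT INTO t (colA, colB) VALUES (?, ?)", ROWS)
    return conn


def top_n(conn, rank_fn, order_by, n=2):
    """Return rows whose rank within their colA group is at most n.

    rank_fn and order_by are trusted constants here, never user input.
    """
    sql = f"""
        SELECT colA, colB, freq
        FROM (SELECT colA, colB, freq,
                     {rank_fn}() OVER (PARTITION BY colA
                                       ORDER BY {order_by}) AS seqnum
              FROM (SELECT colA, colB, COUNT(*) AS freq
                    FROM t
                    GROUP BY colA, colB) counts) ranked
        WHERE seqnum <= ?
        ORDER BY colA, freq DESC, colB DESC
    """
    return conn.execute(sql, (n,)).fetchall()


conn = make_db()
dense = top_n(conn, "DENSE_RANK", "freq DESC")
ranked = top_n(conn, "RANK", "freq DESC")
exact = top_n(conn, "ROW_NUMBER", "freq DESC, colB DESC")

print("DENSE_RANK:", dense)
print("RANK:      ", ranked)
print("ROW_NUMBER:", exact)
```

On this data, colA = 1 has two colB values tied at frequency 2, so DENSE_RANK() keeps three rows for that group (13, 12, and 11), while RANK() and ROW_NUMBER() keep only 13 and 12. RANK() could still return more than two rows if the tie fell on the second place.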

You could probably use a couple of CTEs for this, something like:

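-- Returns only the most frequent colB per colA (ties included), i.e. N = 1.
WITH counts AS (
   -- frequency of every (colA, colB) pair
   SELECT colA, colB, COUNT(*) AS freq FROM t GROUP BY colA, colB
), most_freq AS (
   -- highest frequency per colA; the alias is required by the join below
   SELECT colA, MAX(freq) AS freq FROM counts GROUP BY colA
)
   SELECT counts.*
     FROM counts
     JOIN most_freq ON (counts.colA = most_freq.colA 
                        AND counts.freq = most_freq.freq);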
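Since MariaDB 10.1 predates both window functions and CTEs (they arrived in MariaDB 10.2), a correlated subquery is one possible fallback. The sketch below is not from either answer. It keeps a (colA, colB) pair when fewer than N pairs in the same colA group rank ahead of it, breaking frequency ties by the larger colB (an assumption). It is run here against SQLite through Python's sqlite3 module and has not been tested on MariaDB 10.1 itself. The names TOP_N_SQL and top_n_without_windows are hypothetical.

```python
import sqlite3

# Sample rows (colA, colB) copied from the table in the question.
ROWS = [(1, 13), (1, 13), (1, 12), (1, 12), (1, 11),
        (2, 78), (2, 78), (2, 78), (2, 13), (2, 13), (2, 9)]

# Plain SQL with derived tables only: no window functions, no CTEs.
# A pair is kept when fewer than N pairs of the same colA beat it,
# where "beat" means a higher freq, or the same freq and a larger colB.
TOP_N_SQL = """
SELECT c.colA, c.colB, c.freq
FROM (SELECT colA, colB, COUNT(*) AS freq
      FROM t
      GROUP BY colA, colB) c
WHERE (SELECT COUNT(*)
       FROM (SELECT colA, colB, COUNT(*) AS freq
             FROM t
             GROUP BY colA, colB) d
       WHERE d.colA = c.colA
         AND (d.freq > c.freq
              OR (d.freq = c.freq AND d.colB > c.colB))) < ?
ORDER BY c.colA, c.freq DESC, c.colB DESC
"""


def top_n_without_windows(conn, n):
    """Return the n most frequent colB values for each colA."""
    return conn.execute(TOP_N_SQL, (n,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, colA INTEGER, colB INTEGER)")
conn.executemany("INSERT INTO t (colA, colB) VALUES (?, ?)", ROWS)

top_two = top_n_without_windows(conn, 2)
top_one = top_n_without_windows(conn, 1)
print(top_two)
```

The grouped subquery is evaluated for every candidate row, so this is likely to be slow on large tables. Materialising the counts in a temporary table first may help.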

