Get most frequent value per group
I have a table (DeviceOS2) and would like to get the most frequent value for each column (OS and Device) per ID.
ID OS Device
123 OSX Mac
123 OSX PC
123 OSX PC
123 Android Tablet
Desired result:
ID OS Device
123 OSX PC
However, my code currently returns the following:
ID OS Device
123 Android Tablet
123 OSX Mac
123 OSX PC
It looks like it returns every distinct combination.
Current code (T-SQL):
Select
ID,
OS,
Device
FROM(
Select
ID,
OS,
Device
FROM DeviceOS2
Group By ID,OS,Device) a
Group By ID,OS,Device
You could use:
SELECT TOP 1 WITH TIES *
FROM tab
ORDER BY COUNT(*) OVER(PARTITION BY ID,OS) DESC
This is called the mode. You can use window functions:
select o.*
from (select os, device, count(*) as cnt,
row_number() over (partition by os order by count(*) desc) as seqnum
from DeviceOS2
group by os, device
) o
where seqnum = 1;
If you want the most frequent combination, then use:
select os, device, count(*) as cnt
from DeviceOS2
group by os, device
order by count(*) desc
fetch first 1 row only;
(or use select top (1) if you prefer).
EDIT:
For your edited question:
select o.*
from (select os, device, count(*) as cnt,
row_number() over (partition by os order by count(*) desc) as seqnum
from DeviceOS2
group by os, device
) o
where seqnum = 1;
If you want the most frequent value of each column separately for each ID, then the query is a bit more complicated. One method is two aggregations:
select o.id,
max(case when o.seqnum = 1 then o.os end) as most_frequent_os,
max(case when d.seqnum = 1 then d.device end) as most_frequent_device
from (select id, os, count(*) as cnt,
row_number() over (partition by id order by count(*) desc) as seqnum
from DeviceOS2
group by id, os
) o join
(select id, device, count(*) as cnt,
row_number() over (partition by id order by count(*) desc) as seqnum
from DeviceOS2
group by id, device
) d
on d.id = o.id
group by o.id;
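The per-ID, per-column approach can be tried end to end on the sample rows from the question. The following is a minimal sketch, assuming SQL Server 2008 or later. The table variable @DeviceOS2 and the CTE names os_rank and device_rank are illustrative and not part of the original answer, and the second ORDER BY key is an assumption added so that ties are broken deterministically.

```sql
-- Sample rows from the question, loaded into a table variable (illustrative name).
DECLARE @DeviceOS2 TABLE (ID int, OS varchar(20), Device varchar(20));

INSERT INTO @DeviceOS2 (ID, OS, Device) VALUES
    (123, 'OSX',     'Mac'),
    (123, 'OSX',     'PC'),
    (123, 'OSX',     'PC'),
    (123, 'Android', 'Tablet');

-- Rank each OS within its ID by how often it occurs.
WITH os_rank AS (
    SELECT ID, OS,
           ROW_NUMBER() OVER (PARTITION BY ID
                              ORDER BY COUNT(*) DESC, OS) AS seqnum  -- OS is an assumed tie-breaker
    FROM @DeviceOS2
    GROUP BY ID, OS
),
-- Rank each Device within its ID the same way.
device_rank AS (
    SELECT ID, Device,
           ROW_NUMBER() OVER (PARTITION BY ID
                              ORDER BY COUNT(*) DESC, Device) AS seqnum  -- Device is an assumed tie-breaker
    FROM @DeviceOS2
    GROUP BY ID, Device
)
-- Keep only the top-ranked OS and the top-ranked Device for each ID.
SELECT o.ID, o.OS, d.Device
FROM os_rank AS o
JOIN device_rank AS d
  ON d.ID = o.ID
WHERE o.seqnum = 1
  AND d.seqnum = 1;
```

On the sample data this returns the single row 123, OSX, PC, because OSX occurs three times and PC twice. Filtering both ranked sets to seqnum = 1 before joining avoids the need for an outer aggregation.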
Try this:
select top 1 with ties a.ID, a.OS,a.Device
from (
select d.ID, d.OS, d.Device, ROW_NUMBER () over (partition by d.OS, d.Device order by id) rnk
from DeviceOS2 d)a
order by a.rnk desc
Update
If you need the most frequent one for each ID:
select c.ID,c.OS,c.Device from (
select d.ID, d.OS, d.Device, ROW_NUMBER () over (partition by d.id, d.OS, d.Device order by id) rnk
from DeviceOS2 d)c
join
(
select a.ID,max(a.rnk) AS rnk
from (
select d.ID, d.OS, d.Device, ROW_NUMBER () over (partition by d.id, d.OS, d.Device order by id) rnk
from DeviceOS2 d)a
group by a.ID) a
on c.ID = a.ID and a.rnk = c.rnk
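For comparison, the most frequent (OS, Device) pair per ID can also be found by counting each combination once and ranking the counts. This is a hedged sketch rather than the answerer's method. It assumes SQL Server 2008 or later, uses an illustrative table variable @DeviceOS2 filled with the question's sample rows, and uses RANK so that tied combinations are all returned.

```sql
-- Sample rows from the question, loaded into a table variable (illustrative name).
DECLARE @DeviceOS2 TABLE (ID int, OS varchar(20), Device varchar(20));

INSERT INTO @DeviceOS2 (ID, OS, Device) VALUES
    (123, 'OSX',     'Mac'),
    (123, 'OSX',     'PC'),
    (123, 'OSX',     'PC'),
    (123, 'Android', 'Tablet');

-- Count every (OS, Device) pair per ID, then keep the pair(s) with the highest count.
SELECT c.ID, c.OS, c.Device
FROM (
    SELECT ID, OS, Device,
           RANK() OVER (PARTITION BY ID ORDER BY COUNT(*) DESC) AS rnk
    FROM @DeviceOS2
    GROUP BY ID, OS, Device
) AS c
WHERE c.rnk = 1;
```

On the sample data the pair (OSX, PC) occurs twice and every other pair once, so the query returns 123, OSX, PC. Swapping RANK for ROW_NUMBER would return exactly one row per ID even when counts tie.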