Get most frequent value per group

I have a table (DeviceOS2) and would like to get the most frequent value for each column (OS and Device) per ID.

ID      OS      Device

123     OSX     Mac 
123     OSX     PC  
123     OSX     PC  
123     Android Tablet
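
For anyone who wants to reproduce this, a minimal setup sketch follows. The column types and lengths are assumptions, since the original schema is not shown; only the table name, column names, and the four rows come from the sample above.

```sql
-- Assumed schema: the question does not state the column types.
CREATE TABLE DeviceOS2 (
    ID     INT         NOT NULL,
    OS     VARCHAR(50) NOT NULL,
    Device VARCHAR(50) NOT NULL
);

-- The four sample rows listed above.
INSERT INTO DeviceOS2 (ID, OS, Device) VALUES
    (123, 'OSX',     'Mac'),
    (123, 'OSX',     'PC'),
    (123, 'OSX',     'PC'),
    (123, 'Android', 'Tablet');
```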

Desired result:

ID      OS      Device

123     OSX     PC  

However, my code currently returns the following:

ID       OS            Device

123      Android       Tablet
123      OSX           Mac
123      OSX           PC

It looks like it picks up every distinct combination.

Current code (T-SQL):

Select 
ID,
OS,
Device

FROM(
Select 
ID,
OS,
Device
FROM DeviceOS2
Group By ID,OS,Device) a 
Group By ID,OS,Device

You could use:

SELECT TOP 1 WITH TIES *
FROM tab
ORDER BY COUNT(*) OVER(PARTITION BY ID,OS) DESC

This is called the mode. You can use window functions:

select o.*
from (select os, device, count(*) as cnt,
             row_number() over (partition by os order by count(*) desc) as seqnum
      from DeviceOS2
      group by os, device
     ) o
where seqnum = 1;

If you want the most frequent combination, then use:

select os, device, count(*) as cnt
from DeviceOS2
group by os, device
order by count(*) desc
fetch first 1 row only;

(or use select top (1) if you prefer).
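
Since the question is about T-SQL, the top (1) form might look like the sketch below. SQL Server accepts fetch first only together with an offset clause, so top (1) is the shorter spelling there. With the four sample rows from the question, this should return OSX, PC with a count of 2.

```sql
-- Most frequent (OS, Device) combination across the whole table.
-- TOP (1) keeps a single row; ties are broken arbitrarily.
select top (1) os, device, count(*) as cnt
from DeviceOS2
group by os, device
order by count(*) desc;
```

If several combinations can share the highest count and all of them are wanted, top (1) with ties is one option.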

EDIT:

For your edited question:

select o.*
from (select os, device, count(*) as cnt,
             row_number() over (partition by os order by count(*) desc) as seqnum
      from DeviceOS2
      group by os, device
     ) o
where seqnum = 1;

If you want the most frequent value of each column independently for each ID, the query is a bit more complicated. One method is two aggregations:

select o.id,
       max(case when o.seqnum = 1 then o.os end) as most_frequent_os,
       max(case when d.seqnum = 1 then d.device end) as most_frequent_device
from (select id, os, count(*) as cnt,
             row_number() over (partition by id order by count(*) desc) as seqnum
      from DeviceOS2
      group by id, os
     ) o join
     (select id, device, count(*) as cnt,
             row_number() over (partition by id order by count(*) desc) as seqnum
      from DeviceOS2
      group by id, device
     ) d
     on d.id = o.id
group by o.id;

Try this:

select top 1 with ties a.ID, a.OS,a.Device
from (
select d.ID, d.OS, d.Device, ROW_NUMBER () over (partition by d.OS, d.Device order by id) rnk
from DeviceOS2 d)a
order by a.rnk desc

Update

If you need the most frequent one for each ID:

select c.ID,c.OS,c.Device from (
select d.ID, d.OS, d.Device, ROW_NUMBER () over (partition by d.id, d.OS, d.Device order by id) rnk
from DeviceOS2 d)c
join 
(
select  a.ID,max(a.rnk) AS rnk
from (
select d.ID, d.OS, d.Device, ROW_NUMBER () over (partition by d.id, d.OS, d.Device order by id) rnk
from DeviceOS2 d)a
group by a.ID) a
on c.ID = a.ID and a.rnk = c.rnk
