Partitioning in SQL
I've been trying to find the median number of times an account (person) is seen (appt_id) by a provider (provider_code) in a given period. The attached SQL doesn't seem to be capturing all of the provider_codes, and I can't figure out why.
Desired outcome is that all provider_code values are listed with a median number.
*I don't have access to MS SQL Server 2012 or newer. Yes, we are way behind the times, and yes, it does make life much more difficult.
SELECT
provider_code, office_location,
CONVERT(INT, count(account)) AS Median
FROM
(
SELECT
office_location,provider_code,
account,appt_date,dept_code,appt_status,appt_class,
ROW_NUMBER( ) OVER (
PARTITION BY office_location,provider_code
ORDER BY account ASC) as RowAsc,
ROW_NUMBER( ) OVER (
PARTITION BY office_location,provider_code
ORDER BY account DESC) as RowDesc
FROM appointments_view WITH(NOLOCK)
WHERE account IS NOT NULL AND appt_date BETWEEN '1/1/17' /*24 month prior*/ AND '1/1/19'
) X
WHERE
RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1)
GROUP BY office_location,provider_code
ORDER BY office_location,provider_code
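A likely reason some provider_codes go missing from the query above: both ROW_NUMBER() calls order by account, which repeats within a provider, so the ascending and descending numbers are not guaranteed to mirror each other, and a whole partition can end up with no row passing the RowAsc IN (...) filter. Also, count(account) only counts the surviving middle rows, not appointments per account. Below is a minimal sketch of the same RowAsc/RowDesc idea that first counts appointments per account and adds account as a tiebreaker. The table variable @a and its rows are hypothetical stand-ins for appointments_view, and nothing here needs PERCENTILE_CONT.

```sql
-- Hypothetical sample rows standing in for appointments_view
declare @a table (
    appt_id int identity(1,1) primary key,
    provider_code varchar(10) not null,
    office_location char(3) not null,
    account int not null
);
insert into @a (provider_code, office_location, account) values
('FOO1','REN',100001),
('FOO1','REN',100002),('FOO1','REN',100002),
('FOO1','REN',100003),('FOO1','REN',100003),('FOO1','REN',100003);

select provider_code, office_location
     , avg(1.0 * TotalAppointments) as MedianApt  -- 1.0 * forces a decimal average
from (
    select provider_code, office_location, account, TotalAppointments
         -- account is unique per partition here, so the two orderings are exact mirrors
         , row_number() over (partition by provider_code, office_location
                              order by TotalAppointments asc, account asc) as RowAsc
         , row_number() over (partition by provider_code, office_location
                              order by TotalAppointments desc, account desc) as RowDesc
    from (
        -- step 1: appointments per account for each provider and office
        select provider_code, office_location, account
             , count(appt_id) as TotalAppointments
        from @a
        group by provider_code, office_location, account
    ) per_account
) x
where RowAsc in (RowDesc, RowDesc - 1, RowDesc + 1)  -- keeps the one or two middle rows
group by provider_code, office_location
order by provider_code, office_location;
```

With these sample rows the per-account counts are 1, 2 and 3, so the single middle row gives a median of 2.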
For a median you could use the window function PERCENTILE_CONT or PERCENTILE_DISC (MS SQL Server 2012+).
Example snippet:
declare @Appointments table (
appt_id int primary key identity(4046100,1),
appt_date date not null default GetDate(),
account int not null,
provider_code varchar(10) not null,
office_location char(3) not null default 'REN',
appt_class char(3) not null
);
insert into @Appointments (appt_date, account, provider_code, appt_class) values
('2019-02-01',100001,'FOO1','IND'),('2019-02-01',100002,'FOO1','IND'),('2019-02-01',100002,'FOO1','PSY'),('2019-02-01',100002,'FOO1','IND'),
('2019-02-01',100002,'FOO1','IND'),('2019-02-01',100003,'FOO1','IND'),('2019-02-01',100003,'FOO1','IND'),('2019-02-01',100003,'FOO1','IND');
select provider_code, office_location, MAX(MedianContTotalAppointments) AS MedianApt
from
(
select provider_code, office_location, account
, count(appt_id) as TotalAppointments
, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY count(appt_id)) OVER (PARTITION BY provider_code, office_location) AS MedianContTotalAppointments
-- , PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY count(*)) OVER (PARTITION BY provider_code, office_location) AS MedianDiscTotalAppointments
from @Appointments
where account IS NOT NULL
and appt_date BETWEEN cast('2017-02-01' as date) AND cast('2019-02-01' as date)
group by provider_code, office_location, account
) q
group by provider_code, office_location
order by provider_code, office_location;
Returns:
provider_code office_location MedianApt
FOO1 REN 3
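To see where the 3 comes from, and how PERCENTILE_CONT and PERCENTILE_DISC differ, here is a small sketch over hand-written per-account totals (SQL Server 2012+ only). The grp labels and the 'even' rows are made up for illustration. The 'odd' rows are the counts 1, 4 and 3 that the sample data produces for accounts 100001, 100002 and 100003.

```sql
-- Hypothetical per-account totals; 'odd' mirrors the sample data above
select distinct grp
     , PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY TotalAppointments)
           OVER (PARTITION BY grp) as MedianCont   -- interpolates between the two middle values
     , PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY TotalAppointments)
           OVER (PARTITION BY grp) as MedianDisc   -- always returns a value that exists in the set
from (values
        ('odd', 1), ('odd', 4), ('odd', 3),
        ('even', 1), ('even', 2), ('even', 3), ('even', 4)
     ) as t(grp, TotalAppointments)
order by grp;
```

For the 'odd' group both functions give 3. For the 'even' group PERCENTILE_CONT gives 2.5 while PERCENTILE_DISC gives 2, so the choice matters when a provider has an even number of accounts.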
In an MS SQL Server version before 2012, this example snippet might work:
declare @Appointments table (
appt_id int primary key identity(4046100,1),
appt_date date not null default GetDate(),
account int not null,
provider_code varchar(10) not null,
office_location char(3) not null default 'REN',
appt_class char(3) not null
);
insert into @Appointments (appt_date, account, provider_code, appt_class) values
('2019-02-01',100001,'FOO1','IND'),('2019-02-01',100002,'FOO1','IND'),('2019-02-01',100002,'FOO1','PSY'),('2019-02-01',100002,'FOO1','IND')
,('2019-02-01',100002,'FOO1','IND'),('2019-02-01',100003,'FOO1','IND'),('2019-02-01',100003,'FOO1','IND'),('2019-02-01',100003,'FOO1','IND')
--,('2019-02-01',100004,'FOO1','IND'),('2019-02-01',100004,'FOO1','IND')
;
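-- 1.0 * avoids integer averaging when two middle rows are averaged (even number of accounts)
select provider_code, office_location, AVG(1.0 * TotalAppointments) AS MedianApt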
from
(
select provider_code, office_location, account
, COUNT(appt_id) as TotalAppointments
, ROW_NUMBER() OVER (PARTITION BY provider_code, office_location ORDER BY COUNT(appt_id) ASC) AS rn
, COUNT(*) OVER (PARTITION BY provider_code, office_location) AS cnt
from @Appointments
where account IS NOT NULL
and appt_date BETWEEN cast('2017-02-01' as date) AND cast('2019-02-01' as date)
group by provider_code, office_location, account
) q
where rn in (FLOOR((cnt+1)*0.5), CEILING((cnt+1)*0.5))
group by provider_code, office_location
order by provider_code, office_location;
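A quick sketch of how the rn filter picks the middle rows when there is an even number of accounts, using four made-up totals (1, 2, 3, 4). It also shows that AVG over an int column does integer arithmetic in SQL Server, which matters once two middle rows are averaged.

```sql
-- Hypothetical per-account totals for a single provider
select AVG(TotalAppointments)       as MedianInt      -- integer average
     , AVG(1.0 * TotalAppointments) as MedianDecimal  -- decimal average
from (
    select TotalAppointments
         , ROW_NUMBER() OVER (ORDER BY TotalAppointments ASC) as rn
         , COUNT(*) OVER () as cnt
    from (values (1), (2), (3), (4)) as t(TotalAppointments)
) q
-- cnt = 4, so (cnt+1)*0.5 = 2.5 and the filter keeps rn 2 and rn 3
where rn in (FLOOR((cnt+1)*0.5), CEILING((cnt+1)*0.5));
```

The two middle totals are 2 and 3, so MedianInt comes back as 2 and MedianDecimal as 2.5. With an odd cnt, FLOOR and CEILING land on the same row and both columns agree.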