SQL Server - Group By, Average and Percentiles
I have a FormSummaries table in SQL Server with the following relevant columns, shown here with sample data:
FormName | CompletionTime
Form1 | 70
Form1 | 20
Form1 | 30
Form1 | 40
Form1 | 80
Form1 | 60
Form1 | 90
Form1 | 10
Form2 | 30
Form2 | 40
Form2 | 80
Form2 | 90
Form2 | 40
Form2 | 1000
Form2 | 120
Form2 | 70
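For anyone who wants to reproduce this, the sample rows above could be loaded with something like the following sketch. The column types are an assumption (the actual schema is not shown here); CompletionTime is made nullable only because the queries below filter out NULLs.

```sql
-- Assumed schema: the real column types are not stated in the question.
CREATE TABLE FormSummaries (
    FormName       varchar(50) NOT NULL,
    CompletionTime int NULL  -- nullable, since the queries filter out NULLs
);

-- The sixteen sample rows from the table above.
INSERT INTO FormSummaries (FormName, CompletionTime)
VALUES ('Form1', 70), ('Form1', 20), ('Form1', 30), ('Form1', 40),
       ('Form1', 80), ('Form1', 60), ('Form1', 90), ('Form1', 10),
       ('Form2', 30), ('Form2', 40), ('Form2', 80), ('Form2', 90),
       ('Form2', 40), ('Form2', 1000), ('Form2', 120), ('Form2', 70);
```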
What I need to do is:
1) Group the data by form name and get the average completion time for each form, which is easy:
SELECT
FormName, AVG(CompletionTime)
FROM
FormSummaries
WHERE
CompletionTime is not null
GROUP BY
FormName
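2) Get the average completion time of the top 25% / bottom 25% for each form type (i.e. the average of the fastest and slowest 25% of times taken to complete the form). Ideally this would be in one query, i.e.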
FormName | Bottom25%AverageCompletionTime | Top25%AverageCompletionTime
Form1 | 85 | 15
Form2 | 560 | 35
I live in the real world and realise that might not be possible, so separate queries for top and bottom would be fine, e.g.
FormName | Bottom25%AverageCompletionTime
Form1 | 85
Form2 | 560
FormName | Top25%AverageCompletionTime
Form1 | 15
Form2 | 35
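I have looked at NTILE and OVER with PARTITION BY, but I can't seem to get anything to produce the desired result (though that may well be because I'm not implementing them correctly!).
Can anyone help?
Many thanks.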
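NTILE splits the ordered results into buckets. You are interested in quartiles, so use NTILE(4) to split the rows into 4 groups, partitioned by form name. To do this as 2 queries: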
-- top 25%
SELECT formname, AVG(CompletionTime)
FROM
(SELECT
FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM
FormSummaries
WHERE CompletionTime IS NOT NULL )
x
WHERE QuartPercentile = 1
GROUP BY formname
-- bottom 25%
SELECT formname, AVG(CompletionTime)
FROM
(SELECT
FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM
FormSummaries
WHERE CompletionTime IS NOT NULL)
x
WHERE QuartPercentile = 4
GROUP BY formname
Or as a single query:
SELECT formname,AVG( case when QuartPercentile = 4 then CompletionTime else null end) as [Bottom25%AverageCompletionTime]
, AVG( case when QuartPercentile = 1 then CompletionTime else null end) as [Top25%AverageCompletionTime]
FROM
(SELECT
FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM
FormSummaries
WHERE CompletionTime IS NOT NULL)
x
GROUP BY formname
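One thing worth checking before relying on NTILE(4) as "25%": when the row count in a partition is not divisible by 4, the earlier buckets get the extra rows, and tied values can end up in different buckets because assignment is by row position. A small sketch on made-up values (the inline VALUES list is purely illustrative and not from the question's table):

```sql
-- Ten rows split into four buckets: 10 is not divisible by 4,
-- so NTILE makes the first two buckets one row larger.
SELECT Quartile, COUNT(*) AS RowsInBucket
FROM (
    SELECT v, NTILE(4) OVER (ORDER BY v) AS Quartile
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS t(v)
) AS x
GROUP BY Quartile
ORDER BY Quartile;
-- Buckets 1 and 2 hold 3 rows each; buckets 3 and 4 hold 2 rows each.
```

With the question's sample data each form has 8 rows, so every bucket holds exactly 2 rows and the issue does not show up.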
Bear in mind that if your CompletionTime column contains integers, AVG will return an integer, so you may need to cast to get the precision you want, e.g.
AVG( case when QuartPercentile = 1 then cast(CompletionTime AS decimal(9,2)) else null end)
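The truncation is easy to see on two made-up integer values. This is only an illustrative sketch; the derived table t and column v are hypothetical names:

```sql
SELECT AVG(v)                       AS IntAvg,  -- integer result: (1 + 2) / 2 gives 1
       AVG(CAST(v AS decimal(9,2))) AS DecAvg   -- decimal result: 1.5
FROM (VALUES (1), (2)) AS t(v);
```

The sample data in the question happens to produce whole-number averages (15, 85, 35, 560), so the cast only starts to matter with other data.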
You can use CTE + PIVOT:
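;WITH PercentCount AS (
-- Per form: the size of one 25% slice and the total row count
SELECT FormName,
COUNT(*)/4 as [Bottom25Percent], -- rows in one quarter (integer division)
COUNT(*) as [Top25Percent] -- total rows for the form
FROM FormSummaries
WHERE CompletionTime IS NOT NULL
GROUP BY FormName
), FormsWithRowNumber AS (
-- Number the rows of each form from fastest to slowest
SELECT f.FormName,
f.CompletionTime,
ROW_NUMBER() OVER (PARTITION BY f.FormName ORDER BY f.CompletionTime) as rn
FROM FormSummaries f
WHERE f.CompletionTime IS NOT NULL
), final AS (
-- 1 = fastest quarter, 2 = slowest quarter, 0 = everything in between
SELECT f.FormName,
f.CompletionTime,
CASE WHEN f.rn between 1 and [Bottom25Percent] THEN 1
WHEN f.rn between [Top25Percent]-[Bottom25Percent]+1 and [Top25Percent] THEN 2
ELSE 0 END as [TopBottom]
FROM FormsWithRowNumber f
INNER JOIN PercentCount p
ON p.FormName = f.FormName
)
SELECT FormName,
[1] as [Top25%AverageCompletionTime],
[2] as [Bottom25%AverageCompletionTime]
FROM final
PIVOT (
AVG(CompletionTime) FOR TopBottom IN ([1],[2])
) as pvt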
Output:
FormName Top25%AverageCompletionTime Bottom25%AverageCompletionTime
Form1 15 85
Form2 35 560