Taking an average in SQL after throwing away outliers
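I have a generic log table that can be attached to processes and their results. I get the average time taken using a process performance view: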
WITH Events
AS (
    SELECT PR.DATA_DT_ID
        ,P.ProcessID
        ,P.ProcessName
        ,PL.GUID
        ,PL.EventText
        ,PL.EventTime
    FROM MISProcess.ProcessResults AS PR
    INNER JOIN MISProcess.ProcessResultTypes AS PRT
        ON PRT.ResultTypeID = PR.ResultTypeID
            AND PRT.IsCompleteForTiming = 1
    INNER JOIN MISProcess.Process AS P
        ON P.ProcessID = PR.ProcessID
    INNER JOIN MISProcess.ProcessLog AS PL
        ON PL.BatchRunID = PR.BatchRunID
            AND PL.ProcessID = P.ProcessID
            AND [GUID] IS NOT NULL
            AND (
                PL.EventText LIKE 'Process Starting:%'
                OR PL.EventText LIKE 'Process Complete:%'
            )
)
SELECT Start.DATA_DT_ID
    ,Start.ProcessName
    ,AVG(DATEDIFF(SECOND, Start.EventTime, Finish.EventTime)) AS AvgDurationSeconds
    ,COUNT(*) AS NumRuns
FROM Events AS Start
INNER JOIN Events AS Finish
    ON Start.EventText LIKE 'Process Starting:%'
        AND Finish.EventText LIKE 'Process Complete:%'
        AND Start.DATA_DT_ID = Finish.DATA_DT_ID
        AND Start.ProcessID = Finish.ProcessID
        AND Start.GUID = Finish.GUID
GROUP BY Start.DATA_DT_ID
    ,Start.ProcessName
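The GUID links the start and finish entries amongst the other "comment"-style entries.
Now, I can filter this to exclude run times from older months so that, for instance, a process's average performance is derived only from the last three months.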
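As a minimal sketch of such a three-month filter, assuming SQL Server and using a hypothetical table variable @ProcessLog as a stand-in for the real MISProcess.ProcessLog table, the cut-off could be written with DATEADD and GETDATE:

```sql
-- Hypothetical stand-in for MISProcess.ProcessLog, for illustration only.
DECLARE @ProcessLog TABLE (EventText varchar(100), EventTime datetime);

INSERT INTO @ProcessLog (EventText, EventTime)
VALUES ('Process Starting: recent run', DATEADD(DAY, -10, GETDATE()))
      ,('Process Starting: old run', DATEADD(MONTH, -6, GETDATE()));

-- Keep only events logged within the last three months.
SELECT EventText
      ,EventTime
FROM @ProcessLog
WHERE EventTime >= DATEADD(MONTH, -3, GETDATE());
```

In the view itself the same predicate would presumably be applied to PL.EventTime; where exactly it belongs is an assumption, since the filtered version of the query is not shown.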
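The problem comes when I have outliers due to bad performance or debugging, where a process finishes in 0 seconds.
I'd like to automatically eliminate any outliers somehow.
Would the VAR() or STDEV() aggregate functions work?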
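Without having analysed the query in detail, my first thought is this:
Aggregate functions ignore NULLs (except COUNT(*)), so if you can turn the outliers into NULL within the expression, that would help.
AVG( CASE WHEN Start.EventTime = Finish.EventTime THEN NULL
          ELSE DATEDIFF(SECOND, Start.EventTime, Finish.EventTime)
     END )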
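On the STDEV() idea raised in the question, the following is one possible sketch and not part of the original answer. It assumes SQL Server and uses a hypothetical table variable @Durations with made-up sample values in place of the per-run DATEDIFF results. Each run is compared with its process's mean, and runs more than two standard deviations away are turned into NULL so that AVG skips them. The factor 2 and the names MeanSeconds, StdevSeconds, and KeepRun are illustrative assumptions.

```sql
-- Hypothetical per-run durations; in the real query these would come from
-- DATEDIFF(SECOND, Start.EventTime, Finish.EventTime).
DECLARE @Durations TABLE (ProcessName varchar(50), DurationSeconds int);

INSERT INTO @Durations (ProcessName, DurationSeconds)
VALUES ('LoadSales', 118), ('LoadSales', 120), ('LoadSales', 122)
      ,('LoadSales', 119), ('LoadSales', 121), ('LoadSales', 0);

WITH Stats AS (
    -- Attach the per-process mean and sample standard deviation to every run.
    SELECT ProcessName
          ,DurationSeconds
          ,AVG(CAST(DurationSeconds AS float)) OVER (PARTITION BY ProcessName) AS MeanSeconds
          ,STDEV(DurationSeconds) OVER (PARTITION BY ProcessName) AS StdevSeconds
    FROM @Durations
)
,Flagged AS (
    -- STDEV is NULL when a process has a single run; keep that run.
    SELECT ProcessName
          ,DurationSeconds
          ,CASE WHEN StdevSeconds IS NULL
                  OR ABS(DurationSeconds - MeanSeconds) <= 2 * StdevSeconds
                THEN 1 ELSE 0
           END AS KeepRun
    FROM Stats
)
SELECT ProcessName
      -- Rejected runs become NULL and are therefore ignored by AVG.
      ,AVG(CASE WHEN KeepRun = 1 THEN CAST(DurationSeconds AS float) END) AS AvgDurationSeconds
      ,SUM(KeepRun) AS NumRunsKept
      ,COUNT(*) AS NumRuns
FROM Flagged
GROUP BY ProcessName;
```

Note that the outliers themselves inflate both the mean and the standard deviation, so with only a few runs per process an extreme value may still fall inside the threshold. COUNT(*) continues to count every run, which is why the kept runs are counted separately.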