简体   繁体   中英

SQL Server : average count of alerts per day, not including days with no alerts

I have a table that acts as a message log, with the two key tables being TIMESTAMP and TEXT . I'm working on a query that grabs all alerts (from TEXT ) for the past 30 days (based on TIMESTAMP ) and gives a daily average for those alerts.

Here is the query so far:

--goback 30 days start at midnight
declare @olderdate as datetime
set @olderdate = DATEADD(Day, -30, DATEDIFF(Day, 0, GetDate()))

--today at 11:59pm
declare @today as datetime
set @today = dateadd(ms, -3, (dateadd(day, +1, convert(varchar, GETDATE(), 101))))
print @today

--Grab average alerts per day over 30 days
select 
    avg(x.Alerts * 1.0 / 30)
from
    (select count(*) as Alerts 
     from MESSAGE_LOG 
     where text like 'The process%' 
       and text like '%has alerted%'
       and TIMESTAMP between @olderdate and @today) X

However, I want to add something that checks whether there were any alerts for a day and, if there are no alerts for that day, doesn't include it in the average. For example, if there are 90 alerts for a month but they're all in one day, I wouldn't want the average to be 3 alerts per day since that's clearly misleading.

Is there a way I can incorporate this into my query? I've searched for other solutions to this but haven't been able to get any to work.

This isn't written for your query, as I don't have any DDL or sample data, thus I'm going to provide a very simple example instead of how you would do this.

USE Sandbox;
GO

CREATE TABLE dbo.AlertMessage (ID int IDENTITY(1,1),
                               AlertDate date);

INSERT INTO dbo.AlertMessage (AlertDate)
VALUES('20190101'),('20190101'),('20190105'),('20190110'),('20190115'),('20190115'),('20190115');
GO

--Use a CTE to count per day:
WITH Tots AS (
    SELECT AlertDate,
           COUNT(ID) AS Alerts
    FROM dbo.AlertMessage
    GROUP BY AlertDate)
--Now the average
SELECT AVG(Alerts*1.0) AS DayAverage
FROM Tots;
GO

--Clean up
DROP TABLE dbo.AlertMessage;

Instead of dividing by 30 to get the average, divide by the count of distinct days in your results.

select 
    avg(x.Alerts * 1.0 / x.dd)
from
    (select count(*) as Alerts, count(distinct CAST([TIMESTAMP] AS date)) AS dd
...

You're trying to compute a double-aggregate: The average of daily totals.

Without using a CTE , you can try this as well, which is generalized a bit more to work for multiple months.

--get a list of events per day
DECLARE @Event TABLE
(
    ID INT NOT NULL IDENTITY(1, 1)
    ,DateLocalTz DATE NOT NULL--make sure to handle time zones
    ,YearLocalTz AS DATEPART(YEAR, DateLocalTz) PERSISTED
    ,MonthLocalTz AS DATEPART(MONTH, DateLocalTz) PERSISTED
)
/*
INSERT INTO @Event(EntryDateLocalTz)
SELECT DISTINCT CONVERT(DATE, TIMESTAMP)--presumed to be in your local time zone because you did not specify
FROM dbo.MESSAGE_LOG
WHERE UPPER([TEXT]) LIKE 'THE PROCESS%' AND UPPER([TEXT]) LIKE '%HAS ALERTED%'--case insenitive
*/
INSERT INTO @Event(DateLocalTz)
VALUES ('2018-12-31'), ('2019-01-01'), ('2019-01-01'), ('2019-01-01'), ('2019-01-12'), ('2019-01-13')
--get average number of alerts per alerting day each month
--  (this will not return months with no alerts,
--  use a LEFT OUTER JOIN against a month list table if you need to include uneventful months)
SELECT
    YearLocalTz
    ,MonthLocalTz
    ,AvgAlertsOfAlertingDays = AVG(CONVERT(REAL, NumDailyAlerts))
FROM
    (
        SELECT
            YearLocalTz
            ,MonthLocalTz
            ,DateLocalTz
            ,NumDailyAlerts = COUNT(*)
        FROM @Event
        GROUP BY YearLocalTz, MonthLocalTz, DateLocalTz
    ) AS X
GROUP BY YearLocalTz, MonthLocalTz
ORDER BY YearLocalTz ASC, MonthLocalTz ASC

Some things to note in my code:

  1. I use PERSISTED columns to get the month and year date parts (because I'm lazy when populating tables)
  2. Use explicit CONVERT to escape integer math that rounds down decimals. Multiplying by 1.0 is a less-readable hack.
  3. Use CONVERT(DATE, ...) to round down to midnight instead of converting back and forth between strings
  4. Do case-insensitive string searching by making everything uppercase (or lowercase, your preference)
  5. Don't subtract 3 milliseconds to get the very last moment before midnight. Change your semantics to interpret the end of a time range as exclusive, instead of dealing with the precision of your datatypes. The only difference is using explicit comparators (ie use < instead of <= ). Also, DATETIME resolution is 1/300 th of a second , not 3 milliseconds.
  6. Avoid using built-in keywords as column names (ie "TEXT"). If you do, wrap them in square brackets to avoid ambiguity.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM