
How to calculate average in date column

I don't know how to calculate the average age of a column of type date in SQL Server.

You can use datediff() and aggregation. Assuming that your date column is called dt in table mytable, and that you want the average age in years over the whole table, you would do:

select avg(datediff(year, dt, getdate())) avg_age
from mytable

You can change the first argument to datediff() (which is called the date part) to any other supported value, depending on what you actually mean by age; for example, datediff(day, dt, getdate()) gives you the difference in days.
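As a rough sketch of that idea (reusing the assumed mytable and dt names), the following takes the difference in days and divides by 365.25 to approximate years. The 365.25 divisor is an assumption made here for illustration, so treat the result as an estimate. Dividing by a decimal literal also means avg() works on a decimal rather than on the int that datediff() returns, so the average keeps its fractional part.

```sql
-- Sketch only: mytable and dt are the assumed names from above.
-- 365.25 is an approximate days-per-year figure, not an exact conversion.
select avg(datediff(day, dt, getdate()) / 365.25) avg_age_years
from mytable
```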

First, let's calculate the age in years correctly. See the comments in the code, with the understanding that DATEDIFF does NOT calculate age. It only counts the number of temporal boundaries crossed between the two values.

--===== Local obviously named variables defined and assigned
DECLARE  @StartDT DATETIME = '2019-12-31 23:59:59.997'
        ,@EndDT   DATETIME = '2020-01-01 00:00:00.000'
;
--===== Show the difference in milliseconds between the two date/times
     -- Because of the rounding that DATETIME does on 3.3ms resolution, this will return 4ms,
     -- which certainly does NOT depict an age of 1 year.
 SELECT DATEDIFF(ms,@StartDT,@EndDT)
;
--===== This solution will mistakenly return an age of 1 year for the dates given,
     -- which are only about 4ms apart according to the SELECT above.
 SELECT IncorrectAgeInYears = DATEDIFF(YEAR, @StartDT, @EndDT)
;
--===== This calculates the age in years correctly in T-SQL.
     -- If the anniversary date has not yet occurred, 1 year is subtracted.
 SELECT CorrectAgeInYears = DATEDIFF(yy, @StartDT, @EndDT) 
                          - IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
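
To see the same correction on a more everyday case, here is a small sketch with hypothetical dates (not from the original answer): a start date of 15 June 2000 measured one day before its 20th anniversary. Run it as its own batch, since it reuses the variable names from above.

```sql
--===== Hypothetical dates: one day short of the 20th anniversary.
DECLARE  @StartDT DATETIME = '20000615'
        ,@EndDT   DATETIME = '20200614'
;
--===== DATEDIFF counts 20 year boundaries, but the anniversary (2020-06-15)
     -- has not yet occurred, so the corrected age is 19.
 SELECT  BoundaryCount     = DATEDIFF(yy, @StartDT, @EndDT)
        ,CorrectAgeInYears = DATEDIFF(yy, @StartDT, @EndDT)
                           - IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
```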

Now, let's turn that correct calculation into a Table Valued Function that returns a single scalar value, which gives us a really high speed "Inline Scalar Function".

 CREATE FUNCTION [dbo].[AgeInYears]
        (
        @StartDT DATETIME, --Date of birth or date of manufacture or start date.
        @EndDT   DATETIME  --Usually, GETDATE() or CURRENT_TIMESTAMP but
                           --can be any date source like a column that has an end date.
        )
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
 SELECT AgeInYears = DATEDIFF(yy, @StartDT, @EndDT) 
                   - IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
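
As a quick sanity check, the function can be called on its own once it has been created. The literal dates below are hypothetical values chosen for illustration; they are one day short of a 20th anniversary, so the expected age is 19.

```sql
--===== Usage sketch with hypothetical literal dates.
     -- Run this in a separate batch after the CREATE FUNCTION above.
 SELECT age.AgeInYears
   FROM dbo.AgeInYears('20000615', '20200614') age
;
```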

Then, to Dale's point, let's create a test table and populate it. This one is a little overkill for this problem, but it's also useful for a lot of different examples. Don't let the million rows scare you... this runs in just over 2 seconds on my laptop, including the Clustered Index creation.

--===== Create and populate a large test table on-the-fly.
     -- "SomeInt" has a range of 1 to 50,000 numbers
     -- "SomeLetters2" has a range of "AA" to "ZZ" 
     -- "SomeDecimal" has a range of 10.00 to 100.00 numbers
     -- "SomeDate" has a range of >=01/01/2000 & <01/01/2020 whole dates
     -- "SomeDateTime" has a range of >=01/01/2000 & <01/01/2020 Date/Times
     -- "SomeRand" contains the value of RAND just to show it can be done without a loop.
     -- "SomeHex9" contains 9 hex digits from NEWID()
     -- "SomeFluff" is a fixed width CHAR column just to give the table a little bulk.
 SELECT TOP 1000000
         SomeInt        = ABS(CHECKSUM(NEWID())%50000) + 1
        ,SomeLetters2   = CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
                        + CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
        ,SomeDecimal    = CAST(RAND(CHECKSUM(NEWID())) * 90 + 10 AS DECIMAL(9,2))
        ,SomeDate       = DATEADD(dd, ABS(CHECKSUM(NEWID())%DATEDIFF(dd,'2000','2020')), '2000')
        ,SomeDateTime   = DATEADD(dd, DATEDIFF(dd,0,'2000'), RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2020'))
        ,SomeRand       = RAND(CHECKSUM(NEWID()))  --CHECKSUM produces an INT and is MUCH faster than conversion to VARBINARY.
        ,SomeHex9       = RIGHT(NEWID(),9)
        ,SomeFluff      = CONVERT(CHAR(170),'170 CHARACTERS RESERVED') --Just to add a little bulk to the table.
   INTO dbo.JBMTest
   FROM      sys.all_columns ac1 --Cross Join forms up to 16 million rows
  CROSS JOIN sys.all_columns ac2 --Pseudo Cursor
;
GO
--===== Add a non-unique Clustered Index to SomeDateTime for this demo.
 CREATE CLUSTERED INDEX IXC_Test ON dbo.JBMTest (SomeDateTime ASC)
;

Now, let's find the average age of those million rows, as represented by the SomeDateTime column.

 SELECT  AvgAgeInYears = AVG(age.AgeInYears )
        ,RowsCounted   = COUNT(*)
   FROM dbo.JBMTest tst
  CROSS APPLY dbo.AgeInYears(SomeDateTime,GETDATE()) age
;

Results: (screenshot from the original post, not preserved here)
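
One thing to keep in mind: AgeInYears is an INT, and AVG() over an INT returns an INT in SQL Server, so that query reports the average in whole years. If a fractional average is wanted, a possible variation (a sketch, not part of the original answer) is to cast the age before averaging. The DECIMAL(9,2) precision is an arbitrary choice for illustration.

```sql
--===== Variation: cast to DECIMAL so AVG keeps the fractional part.
 SELECT  AvgAgeInYears = AVG(CAST(age.AgeInYears AS DECIMAL(9,2)))
        ,RowsCounted   = COUNT(*)
   FROM dbo.JBMTest tst
  CROSS APPLY dbo.AgeInYears(tst.SomeDateTime, GETDATE()) age
;
```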
