SQL Server - Select all top of the hour records

I have a large table with records created every second, and I want to select only the records that were created at the top of each hour over the last 2 months. That should give 24 selected records for every day over the last 60 days.

The table structure is Dateandtime, Value1, Value2, etc.

Many Thanks

You could group by the date part, cast(col1 as date), and the hour part, datepart(hh, col1). Then pick the minimum dateandtime within each hour, and join back to the table on that value:

select  *
from    YourTable yt
join    (
        select  min(dateandtime) as dt
        from    YourTable
        where   datediff(day, dateandtime, getdate()) <= 60
        group by
                cast(dateandtime as date)
        ,       datepart(hh, dateandtime)
        ) filter
on      filter.dt = yt.dateandtime

Alternatively, you can group on a date format that only includes the date and the hour. For example, convert(varchar(13), getdate(), 120) returns 2013-05-11 18 .

        ...
        group by
                convert(varchar(13), dateandtime, 120)
        ) filter
        ...
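
To see what the grouping does, here is a minimal sketch on made-up data. The table variable @Sample and its rows are hypothetical stand-ins for YourTable, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data; @Sample stands in for YourTable.
DECLARE @Sample TABLE (dateandtime datetime, Value1 int);

INSERT INTO @Sample (dateandtime, Value1)
VALUES ('2013-05-11T18:00:00', 1),
       ('2013-05-11T18:00:01', 2),
       ('2013-05-11T19:00:02', 3),
       ('2013-05-12T18:00:00', 4);

-- Style 120 cut to 13 characters keeps 'yyyy-mm-dd hh',
-- so each distinct value is one hour of one day.
SELECT  convert(varchar(13), dateandtime, 120) AS hour_bucket,
        min(dateandtime)                       AS first_in_hour
FROM    @Sample
GROUP BY convert(varchar(13), dateandtime, 120)
ORDER BY hour_bucket;
```

This should return three buckets. Note that for the 19:00 bucket the earliest row is 19:00:02, so this approach returns the first row in each hour even when nothing was logged at exactly :00:00.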

For clarity's sake, I would probably use a two-step, CTE-based approach. This works in SQL Server 2005 and newer; you didn't specify which version of SQL Server you're using, so I'm just hoping you're not on an ancient version like 2000 anymore:

-- define a "base" CTE that truncates your "DateAndTime" column to the
-- start of its hour and makes that value accessible under its own name
;WITH BaseCTE AS
(
    SELECT  
        ID, DateAndTime,
        Value1, Value2, 
        HourStart = DATEADD(HOUR, DATEDIFF(HOUR, 0, DateAndTime), 0)
    FROM dbo.YourTable
    WHERE DateAndTime >= @SomeThresholdDateHere
), 
-- define a second CTE which "partitions" the data by this "HourStart"
-- (one partition per hour of each day, not just per hour of the clock)
-- and numbers all rows in each partition starting at 1. So the "last"
-- event for each hour is the one with the RN = 1 value
HourlyCTE AS 
(
    SELECT ID, DateAndTime, Value1, Value2, 
        RN = ROW_NUMBER() OVER(PARTITION BY HourStart ORDER BY DateAndTime DESC)
    FROM BaseCTE
)
SELECT *
FROM HourlyCTE
WHERE RN=1

Also: I wasn't sure what exactly you mean by "top of the hour". Is it the row that was created right at the beginning of each hour (e.g. at 04:00:00), or rather the last row created in that hour's time span? If you mean the first one for each hour, then you'd need to change the ORDER BY DateAndTime DESC to ORDER BY DateAndTime ASC.
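
The effect of the sort direction can be checked on a few made-up rows. This is only a sketch: the table variable @Sample and its data are hypothetical, the multi-row VALUES syntax assumes SQL Server 2008 or later, and the rows are partitioned by the start of their hour so that each day keeps its own set of hours.

```sql
-- Hypothetical sample data: three rows in one hour, one row on the next day.
DECLARE @Sample TABLE (DateAndTime datetime, Value1 int);

INSERT INTO @Sample (DateAndTime, Value1)
VALUES ('2013-05-11T18:00:00', 1),
       ('2013-05-11T18:30:00', 2),
       ('2013-05-11T18:59:59', 3),
       ('2013-05-12T18:00:00', 4);

SELECT  DateAndTime, Value1,
        -- 1 marks the earliest row in each hour
        RN_First = ROW_NUMBER() OVER (
                       PARTITION BY DATEADD(HOUR, DATEDIFF(HOUR, 0, DateAndTime), 0)
                       ORDER BY DateAndTime ASC),
        -- 1 marks the latest row in each hour
        RN_Last  = ROW_NUMBER() OVER (
                       PARTITION BY DATEADD(HOUR, DATEDIFF(HOUR, 0, DateAndTime), 0)
                       ORDER BY DateAndTime DESC)
FROM    @Sample
ORDER BY DateAndTime;
```

Filtering on RN_First = 1 keeps the 18:00:00 rows, while RN_Last = 1 keeps 18:59:59 for May 11. The single row on May 12 gets 1 in both columns.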

You can use the EXISTS operator:

SELECT *
FROM dbo.tableName t 
WHERE t.DateAndTime >= @YourDateCondition
  AND EXISTS (
              SELECT 1
              FROM dbo.tableName t2
              WHERE t2.Dateandtime >= DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime), 0)
                      AND t2.Dateandtime < DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime)+1, 0)
              HAVING MAX(t2.Dateandtime) = t.Dateandtime
              )

Or use the CROSS APPLY operator:

SELECT *
FROM dbo.test83 t CROSS APPLY (
                               SELECT 1
                               FROM dbo.test83 t2
                               WHERE t2.Dateandtime >= DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime), 0)
                                     AND t2.Dateandtime < DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime)+1, 0)
                               HAVING  MAX(t2.Dateandtime) = t.Dateandtime                            
                               ) o(IsMatch)
WHERE t.DateAndTime >= @YourDateCondition 

To improve performance, use this index:

CREATE INDEX x ON dbo.test83(DateAndTime) INCLUDE(Value1, Value2)
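
Both queries above rely on DATEADD(HOUR, DATEDIFF(HOUR, 0, x), 0) to round a datetime down to the start of its hour: DATEDIFF counts the whole hour boundaries crossed since the base date 0 (1900-01-01), and DATEADD adds that many hours back onto the base date. A small sketch with a hypothetical value @d:

```sql
-- @d is a hypothetical timestamp used only for illustration.
DECLARE @d datetime;
SET @d = '2013-05-11T18:37:42';

SELECT  DATEADD(HOUR, DATEDIFF(HOUR, 0, @d), 0)     AS hour_start,      -- 2013-05-11 18:00:00.000
        DATEADD(HOUR, DATEDIFF(HOUR, 0, @d) + 1, 0) AS next_hour_start; -- 2013-05-11 19:00:00.000
```

The inner WHERE clauses use this half-open range (>= hour_start and < next_hour_start) so that each row is compared only with rows from its own hour.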

You can use window functions for this:

select dateandtime, val1, val2, . . .
from (select t.*,
             row_number() over (partition by cast(dateandtime as date), datepart(hour, dateandtime)
                                order by dateandtime
                               ) as seqnum
      from t
     ) t
where seqnum = 1

The function row_number() assigns a sequential number to each row within the groups defined by the partition clause, in this case each hour of each day. Within each group, it orders by the dateandtime value, so the row closest to the top of the hour gets a value of 1. The outer query just selects this one record for each group.

You may need an additional filter clause to get records in the last 60 days. Use this in the subquery:

where dateandtime >= getdate() - 60
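
Put together, the full query might look like the sketch below. It assumes a table named dbo.YourTable with the columns from the question (dateandtime, Value1, Value2) and SQL Server 2008 or later for the date type. DATEADD(day, -60, GETDATE()) is used here as an equivalent way of writing getdate() - 60.

```sql
-- Assumes dbo.YourTable(dateandtime, Value1, Value2) exists; names are illustrative.
SELECT  dateandtime, Value1, Value2
FROM   (
        SELECT  t.dateandtime, t.Value1, t.Value2,
                ROW_NUMBER() OVER (
                    PARTITION BY CAST(t.dateandtime AS date),
                                 DATEPART(hour, t.dateandtime)
                    ORDER BY t.dateandtime
                ) AS seqnum
        FROM    dbo.YourTable AS t
        -- Filter before numbering so only the last 60 days are partitioned.
        WHERE   t.dateandtime >= DATEADD(day, -60, GETDATE())
       ) AS hourly
WHERE   seqnum = 1
ORDER BY dateandtime;
```

Because the filter compares the bare column with a constant expression, an index on dateandtime can be used to limit the rows that are scanned.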

Try:

select * from mytable
where datepart(mi, dateandtime)=0 and 
      datepart(ss, dateandtime)=0 and
      datediff(d, dateandtime, getdate()) <=60

This helped me get the top of the hour. Anything that ends in ":00:00".

WHERE CONVERT(VARCHAR(19), dateandtime, 120) LIKE '%:00:00'
