Use T-SQL window functions to retrieve 5-minute averages from 1-minute data

I have a database table containing one-minute periods of Open, Close, High, Low, and Volume values for a security. I'm using SQL Server 2017, but 2019 RC is an option.

I am trying to find an efficient SQL Server query that can aggregate these into 5-minute windows, where:

  • Open = first Open value of the window
  • Close = last Close value of the window
  • High = max High value of the window
  • Low = min Low value of the window
  • Volume = avg Volume across the window

Ideally, this query would account for gaps in the data, i.e. be based on date calculations rather than on counting preceding / following rows.

For example, say I have the following (6 minutes of data):

| Time             | Open | Close | High | Low | Volume |
|------------------|------|-------|------|-----|--------|
| 2019-10-30 09:30 | 5    | 10    | 15   | 1   | 125000 |
| 2019-10-30 09:31 | 10   | 15    | 20   | 5   | 100000 |
| 2019-10-30 09:32 | 15   | 20    | 25   | 10  | 120000 |
| 2019-10-30 09:33 | 20   | 25    | 30   | 15  | 10000  |
| 2019-10-30 09:34 | 20   | 22    | 40   | 2   | 13122  |
| 2019-10-30 09:35 | 22   | 30    | 35   | 4   | 15000  |

The 09:35 row is not factored in, since it would be the first row of the next 5-minute window.
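
To try queries against the sample rows above, a minimal setup sketch follows. The table name `t` and the column types are assumptions; the question does not give the actual schema.

```sql
-- Hypothetical schema: the table name and data types are assumptions.
CREATE TABLE t (
    [Time]   datetime       NOT NULL PRIMARY KEY,
    [Open]   decimal(18, 4) NOT NULL,
    [Close]  decimal(18, 4) NOT NULL,
    [High]   decimal(18, 4) NOT NULL,
    [Low]    decimal(18, 4) NOT NULL,
    [Volume] int            NOT NULL
);

-- The six one-minute rows from the question.
INSERT INTO t ([Time], [Open], [Close], [High], [Low], [Volume]) VALUES
    ('2019-10-30T09:30:00',  5, 10, 15,  1, 125000),
    ('2019-10-30T09:31:00', 10, 15, 20,  5, 100000),
    ('2019-10-30T09:32:00', 15, 20, 25, 10, 120000),
    ('2019-10-30T09:33:00', 20, 25, 30, 15,  10000),
    ('2019-10-30T09:34:00', 20, 22, 40,  2,  13122),
    ('2019-10-30T09:35:00', 22, 30, 35,  4,  15000);
```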

I am trying to write a query that would give me the following (the first 5-minute aggregate):

| Time             | Open | Close | High | Low | Volume  |
|------------------|------|-------|------|-----|---------|
| 2019-10-30 09:30 | 5    | 22    | 40   | 1   | 73624.4 |

Any tips? I'm banging my head against the wall with the OVER clause and its PARTITION / RANGE options.

The gist of the problem is flooring datetime values to a 5-minute boundary, which (assuming that the data type is datetime) can be done using DATEADD(MINUTE, DATEDIFF(MINUTE, 0, time) / 5 * 5, 0). The rest is basic grouping and window functions:

WITH cte AS (
  SELECT clamped_time
       , [Open]
       , [Close]
       , [High]
       , [Low]
       , [Volume]
       -- rn1 = 1 marks the earliest row of each 5-minute bucket, rn2 = 1 the latest
       , rn1 = ROW_NUMBER() OVER (PARTITION BY clamped_time ORDER BY [Time])
       , rn2 = ROW_NUMBER() OVER (PARTITION BY clamped_time ORDER BY [Time] DESC)
  FROM t
  -- Floor [Time] to the start of its 5-minute bucket
  CROSS APPLY (
      SELECT DATEADD(MINUTE, DATEDIFF(MINUTE, 0, [Time]) / 5 * 5, 0)
  ) AS x(clamped_time)
)
SELECT clamped_time
     , MIN(CASE WHEN rn1 = 1 THEN [Open] END) AS [Open]
     , MIN(CASE WHEN rn2 = 1 THEN [Close] END) AS [Close]
     , MAX([High]) AS [High]
     , MIN([Low]) AS [Low]
     -- AVG truncates when [Volume] is an integer type; cast to decimal first if fractions matter
     , AVG([Volume]) AS [Volume]
FROM cte
GROUP BY clamped_time;

Demo on db<>fiddle
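
The flooring expression can also be checked in isolation. The sketch below applies it to a few literal timestamps, which were chosen here for illustration (one with seconds, one after a gap), and assumes the `datetime` type:

```sql
-- Illustrative values only; they are not taken from the question's table.
SELECT v.ts
     , DATEADD(MINUTE, DATEDIFF(MINUTE, 0, v.ts) / 5 * 5, 0) AS clamped_time
FROM (VALUES
    (CAST('2019-10-30T09:30:00' AS datetime)),  -- clamps to 09:30
    (CAST('2019-10-30T09:34:59' AS datetime)),  -- clamps to 09:30
    (CAST('2019-10-30T09:35:00' AS datetime)),  -- clamps to 09:35
    (CAST('2019-10-30T09:47:00' AS datetime))   -- clamps to 09:45
) AS v(ts);
```

Because the bucket is derived from each timestamp itself, missing minutes simply mean fewer rows in a bucket; nothing depends on counting preceding or following rows.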

You want to analyze the data in 5-minute intervals. You could use window functions with the following partitioning clause:

partition by datepart(year, t.[time]),
    datepart(month, t.[time]),
    datepart(day, t.[time]),
    datepart(hour, t.[time]),
    (datepart(minute, t.[time]) / 5)

Query:

select *
from (
    select  
        t.time,
        row_number() over(
            partition by datepart(year, [time]),
                datepart(month, [time]),
                datepart(day, [time]),
                datepart(hour, [time]),
                (datepart(minute, [time]) / 5)
            order by [time]
        ) [rn],
        first_value([open]) over(
            partition by datepart(year, [time]),
                datepart(month, [time]),
                datepart(day, [time]),
                datepart(hour, [time]),
                (datepart(minute, [time]) / 5)
            order by [time]
        ) [open],
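        last_value([close]) over(
            partition by datepart(year, [time]),
                datepart(month, [time]),
                datepart(day, [time]),
                datepart(hour, [time]),
                (datepart(minute, [time]) / 5)
            order by [time]
            -- without an explicit frame, the default frame ends at the current row,
            -- so last_value would return that row's own close
            rows between unbounded preceding and unbounded following
        ) [close],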
        max([high]) over (
            partition by datepart(year, [time]),
                datepart(month, [time]),
                datepart(day, [time]),
                datepart(hour, [time]),
                (datepart(minute, [time]) / 5)
        ) [high],
        min([low]) over (
            partition by datepart(year, [time]),
                datepart(month, [time]),
                datepart(day, [time]),
                datepart(hour, [time]),
                (datepart(minute, [time]) / 5)
        ) [low],
        avg([volume]) over (
            partition by datepart(year, [time]),
                datepart(month, [time]),
                datepart(day, [time]),
                datepart(hour, [time]),
                (datepart(minute, [time]) / 5)
        ) [volume]
    from mytable t
) t
where rn = 1
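
One detail worth checking with LAST_VALUE: when the OVER clause has an ORDER BY but no frame, the default frame runs from the start of the partition to the current row, so the "last" value is the current row's own value. A minimal sketch on inline values (made up here for illustration) compares the default frame with an explicit full-partition frame:

```sql
-- Illustrative values: t is an ordering key, c stands in for a close price.
SELECT v.t
     , v.c
     , LAST_VALUE(v.c) OVER (ORDER BY v.t) AS default_frame
     , LAST_VALUE(v.c) OVER (
           ORDER BY v.t
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS full_frame
FROM (VALUES (1, 10), (2, 15), (3, 20)) AS v(t, c);
-- default_frame returns 10, 15, 20 (each row's own value)
-- full_frame returns 20, 20, 20 (the true last value)
```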

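You can try this. Note that it only approximates the requested definitions: [Open] is the minimum Open in the window rather than the first one, and [Close] is taken from the next window's Open via LEAD rather than from the window's own last Close, so it is NULL for the final window.

  SELECT
      MIN([Time]) AS [Time],
      MIN([Open]) AS [Open],
      LEAD(MIN([Open])) OVER (ORDER BY MIN([Time])) AS [Close],
      MAX([High]) AS [High],
      MIN([Low]) AS [Low],
      AVG([Volume]) AS [Volume]
  FROM SampleData
  GROUP BY DATEADD(MINUTE, -(DATEPART(MINUTE, [Time]) % 5), [Time])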

SQL Fiddle
