Get the last N periods from a date column, regardless of the period (year, month, day, hour, etc.)
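I have several data tables. Some are loaded quarterly, others monthly or daily, and so on.
Each table has a ReportedDate column. I would like to filter each table down to its last N periods. For example, if the data is daily, the last 3 days. The problem is that I cannot use something like GETDATE() - 3, because data is loaded only on working days, not on holidays and weekends.
I have tried ROW_NUMBER() with PARTITION BY ReportedDate, but it runs very slowly. I would appreciate any suggestions. Table example: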
+-----------+-----------------------------+
| Indicator | ReportedDate |
+-----------+-----------------------------+
| 0.2917 | 2020-08-12 00:00:00.0000000 |
| 0.261919 | 2020-08-13 00:00:00.0000000 |
| 0.259211 | 2020-08-14 00:00:00.0000000 |
| 0.201075 | 2020-08-17 00:00:00.0000000 |
| 0.250153 | 2020-08-18 00:00:00.0000000 |
| 0.333093 | 2020-08-19 00:00:00.0000000 |
| 0.976495 | 2020-08-20 00:00:00.0000000 |
| 0.759739 | 2020-08-21 00:00:00.0000000 |
| 1.17279 | 2020-08-24 00:00:00.0000000 |
| 0.285365 | 2020-08-25 00:00:00.0000000 |
+-----------+-----------------------------+
SELECT *
FROM (
    SELECT Indicator, ReportedDate,
           ROW_NUMBER() OVER (PARTITION BY ReportedDate ORDER BY ReportedDate DESC) AS periods
    FROM indicatorTable
) a
WHERE periods <= 2
Another example, a stock price table:
+--------+--------+-------------------------+
| Ticker | Price | Date |
+--------+--------+-------------------------+
| AAPL | 116.03 | 2020-11-25 00:00:00.000 |
| AAPL | 115.17 | 2020-11-24 00:00:00.000 |
| AAPL | 113.85 | 2020-11-23 00:00:00.000 |
| AAPL | 117.34 | 2020-11-20 00:00:00.000 |
| AAPL | 118.64 | 2020-11-19 00:00:00.000 |
| AAPL | 118.03 | 2020-11-18 00:00:00.000 |
| AAPL | 119.39 | 2020-11-17 00:00:00.000 |
| AAPL | 120.3 | 2020-11-16 00:00:00.000 |
| AAPL | 119.26 | 2020-11-13 00:00:00.000 |
| AAPL | 119.21 | 2020-11-12 00:00:00.000 |
| IBM | 124.2 | 2020-11-25 00:00:00.000 |
| IBM | 124.42 | 2020-11-24 00:00:00.000 |
| IBM | 120.09 | 2020-11-23 00:00:00.000 |
| IBM | 116.94 | 2020-11-20 00:00:00.000 |
| IBM | 117.18 | 2020-11-19 00:00:00.000 |
| IBM | 116.77 | 2020-11-18 00:00:00.000 |
| IBM | 117.7 | 2020-11-17 00:00:00.000 |
| IBM | 118.36 | 2020-11-16 00:00:00.000 |
| IBM | 116.85 | 2020-11-13 00:00:00.000 |
| IBM | 114.5 | 2020-11-12 00:00:00.000 |
| MSFT | 213.87 | 2020-11-25 00:00:00.000 |
| MSFT | 213.86 | 2020-11-24 00:00:00.000 |
| MSFT | 210.11 | 2020-11-23 00:00:00.000 |
| MSFT | 210.39 | 2020-11-20 00:00:00.000 |
| MSFT | 212.42 | 2020-11-19 00:00:00.000 |
| MSFT | 211.08 | 2020-11-18 00:00:00.000 |
| MSFT | 214.46 | 2020-11-17 00:00:00.000 |
| MSFT | 217.23 | 2020-11-16 00:00:00.000 |
| MSFT | 216.51 | 2020-11-13 00:00:00.000 |
| MSFT | 215.44 | 2020-11-12 00:00:00.000 |
+--------+--------+-------------------------+
What I want is the rows for the last two periods, which in this case would be:
+--------+--------+-------------------------+
| Ticker | Price | Date |
+--------+--------+-------------------------+
| AAPL | 116.03 | 2020-11-25 00:00:00.000 |
| AAPL | 115.17 | 2020-11-24 00:00:00.000 |
| IBM | 124.2 | 2020-11-25 00:00:00.000 |
| IBM | 124.42 | 2020-11-24 00:00:00.000 |
| MSFT | 213.87 | 2020-11-25 00:00:00.000 |
| MSFT | 213.86 | 2020-11-24 00:00:00.000 |
+--------+--------+-------------------------+
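Use DENSE_RANK instead of ROW_NUMBER, and do not partition by ReportedDate. Rows that share a date then receive the same rank, so the rank counts distinct periods rather than rows.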
SELECT *
FROM (
    -- Rows sharing a ReportedDate get the same rank; 1 = most recent date.
    SELECT Indicator, ReportedDate,
           DENSE_RANK() OVER (ORDER BY ReportedDate DESC) AS periods
    FROM @t
) a
WHERE periods <= 2
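As a sketch of how the same idea might apply to the stock price example from the question: the table variable @prices is a hypothetical name, loaded with a subset of the sample rows, and the date literals are written in the yyyymmdd form. Because DENSE_RANK is computed over the whole table, every ticker reported on the two most recent dates is kept.

```sql
-- Hypothetical table variable holding a subset of the stock price sample.
DECLARE @prices TABLE (Ticker varchar(10), Price decimal(10, 2), [Date] datetime);

INSERT INTO @prices (Ticker, Price, [Date])
VALUES ('AAPL', 116.03, '20201125'),
       ('AAPL', 115.17, '20201124'),
       ('AAPL', 113.85, '20201123'),
       ('IBM',  124.20, '20201125'),
       ('IBM',  124.42, '20201124'),
       ('IBM',  120.09, '20201123'),
       ('MSFT', 213.87, '20201125'),
       ('MSFT', 213.86, '20201124'),
       ('MSFT', 210.11, '20201123');

-- Rank the distinct dates (1 = most recent) and keep the two latest periods.
SELECT Ticker, Price, [Date]
FROM (
    SELECT Ticker, Price, [Date],
           DENSE_RANK() OVER (ORDER BY [Date] DESC) AS periods
    FROM @prices
) a
WHERE periods <= 2
ORDER BY Ticker, [Date] DESC;
```

On this data the query returns the six rows dated 2020-11-25 and 2020-11-24. If tickers could have different latest dates and the last N periods were wanted per ticker, adding PARTITION BY Ticker inside OVER would be the likely adjustment; that is an assumption about the requirement, not something the question states.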
With the sample data declared as a table variable:
DECLARE @t TABLE (Indicator decimal(37,12), ReportedDate datetime)
INSERT INTO @t (Indicator, ReportedDate)
VALUES (0.2917,   '20200812'),
       (0.261919, '20200813'),
       (0.259211, '20200814'),
       (0.201075, '20200817'),
       (0.250153, '20200818'),
       (0.333093, '20200819'),
       (0.976495, '20200820'),
       (0.759739, '20200821'),
       (1.17279,  '20200824'),
       (0.285365, '20200825')
-- With one row per ReportedDate, the last 3 periods are simply the top 3 rows.
SELECT TOP 3 *
FROM @t
ORDER BY ReportedDate DESC
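When a table holds several rows per date, as the stock price example does, TOP applied to rows no longer matches "last N periods". One possible alternative, sketched here under assumptions, is to find the N most recent distinct dates first and then filter on the oldest of them. The names @prices and @n are hypothetical, and whether this runs faster than a window function depends on the real table and its indexes (an index on the date column is assumed to help; nothing was measured here).

```sql
-- Hypothetical parameter: how many of the most recent periods to keep.
DECLARE @n int = 2;

-- Hypothetical table variable holding a subset of the stock price sample.
DECLARE @prices TABLE (Ticker varchar(10), Price decimal(10, 2), [Date] datetime);

INSERT INTO @prices (Ticker, Price, [Date])
VALUES ('AAPL', 116.03, '20201125'),
       ('AAPL', 115.17, '20201124'),
       ('AAPL', 113.85, '20201123'),
       ('IBM',  124.20, '20201125'),
       ('IBM',  124.42, '20201124'),
       ('IBM',  120.09, '20201123');

SELECT Ticker, Price, [Date]
FROM @prices
WHERE [Date] >= (
    -- Oldest of the @n most recent distinct dates acts as the cutoff.
    SELECT MIN([Date])
    FROM (
        SELECT DISTINCT TOP (@n) [Date]
        FROM @prices
        ORDER BY [Date] DESC
    ) AS lastPeriods
)
ORDER BY Ticker, [Date] DESC;
```

With @n = 2 this keeps the four rows dated 2020-11-25 and 2020-11-24. The cutoff is derived from the data itself, so gaps for weekends and holidays do not matter, and the same query shape should work whether the table is loaded daily, monthly, or quarterly.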