Calculating years since first observation in a monthly time series in R with data.table
I am trying to find a way to calculate the number of years since the start of a monthly time series using data.table. My data frame looks like this:
library(data.table)
df1 <- structure(list(Date = structure(c(8581, 8611, 8643, 8673, 8702,
8734, 8765, 8796, 8824, 8855, 8884, 8916, 8946, 8975, 9008), class = "Date")), row.names = c(NA,
-15L), class = c("data.table", "data.frame"))
Date
1: 1993-06-30
2: 1993-07-30
3: 1993-08-31
4: 1993-09-30
5: 1993-10-29
6: 1993-11-30
7: 1993-12-31
8: 1994-01-31
9: 1994-02-28
10: 1994-03-31
11: 1994-04-29
12: 1994-05-31
13: 1994-06-30
14: 1994-07-29
15: 1994-08-31
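These are only 15 observations, but there can be several hundred. I want to create a column that counts the number of years that have passed since the start of the time series, as shown below. To keep things simple, I want to start counting at 0 and add 1 every 12 months.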
Date Years
1: 1993-06-30 0
2: 1993-07-30 0
3: 1993-08-31 0
4: 1993-09-30 0
5: 1993-10-29 0
6: 1993-11-30 0
7: 1993-12-31 0
8: 1994-01-31 0
9: 1994-02-28 0
10: 1994-03-31 0
11: 1994-04-29 0
12: 1994-05-31 0
13: 1994-06-30 1
14: 1994-07-29 1
15: 1994-08-31 1
Note that the time-series starts in June. The beginning can be any month.
One possible solution is a rolling join with a lookup table:
library(data.table)
lut <- setDT(df1)[, .(Date = seq(min(Date), max(Date), by = "1 year"))][
, Years := seq_along(Date) - 1L][]
lut[df1, on = .(Date), roll = Inf]
Date Years
1: 1993-06-30 0
2: 1993-07-30 0
3: 1993-08-31 0
4: 1993-09-30 0
5: 1993-10-29 0
6: 1993-11-30 0
7: 1993-12-31 0
8: 1994-01-31 0
9: 1994-02-28 0
10: 1994-03-31 0
11: 1994-04-29 0
12: 1994-05-31 0
13: 1994-06-30 1
14: 1994-07-29 1
15: 1994-08-31 1
The lookup table is
lut
Date Years
1: 1993-06-30 0
2: 1994-06-30 1
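To make explicit what the rolling join does, here is a minimal sketch of the same lookup written with base R's findInterval: each date is mapped to the last anniversary at or before it. The name breaks is introduced here for illustration only, and the data is rebuilt from the question's values so the snippet runs on its own.

```r
library(data.table)

# The same 15 dates as in the question, rebuilt from their day counts
df1 <- data.table(Date = as.Date(c(8581, 8611, 8643, 8673, 8702, 8734, 8765, 8796,
                                   8824, 8855, 8884, 8916, 8946, 8975, 9008),
                                 origin = "1970-01-01"))

# Anniversaries of the first observation (illustrative name)
breaks <- seq(min(df1$Date), max(df1$Date), by = "1 year")

# findInterval() gives the index of the last break <= each date;
# subtracting 1 makes the count start at 0
df1[, Years := findInterval(as.numeric(Date), as.numeric(breaks)) - 1L]
```

Like the rolling join, this compares full dates, so an observation that falls a day or two before the anniversary date is still counted in the previous year.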
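Another option is to divide the number of days elapsed since the first date by 365.25 and round down. This is only an approximation: 1994-06-30 is 365 days after 1993-06-30, so that row still gets 0 rather than 1.
setDT(df1)[, Years := floor(as.numeric(Date - min(Date)) / 365.25)]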
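A further sketch, not taken from the answers above, counts calendar months instead of days. It assumes there is at most one observation per calendar month and ignores the day of the month, which may suit month-end data where the last business day varies. The Months column is a helper introduced here for illustration; year() and month() are the accessors exported by data.table.

```r
library(data.table)

# The same 15 dates as in the question, rebuilt from their day counts
df1 <- data.table(Date = as.Date(c(8581, 8611, 8643, 8673, 8702, 8734, 8765, 8796,
                                   8824, 8855, 8884, 8916, 8946, 8975, 9008),
                                 origin = "1970-01-01"))

# Calendar months elapsed since the first observation (illustrative helper column)
df1[, Months := (year(Date) - year(min(Date))) * 12L +
                (month(Date) - month(min(Date)))]

# Integer division by 12 turns elapsed months into completed years
df1[, Years := Months %/% 12L]
```

With the question's data this reproduces the requested column: the first twelve rows get 0 and the rows from 1994-06-30 onward get 1.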