
Calculating years since first observation in monthly time-series R in data.table

I am trying to find a way to calculate the number of years since the start of a monthly time series using data.table. My data frame looks like this:

library(data.table)

df1 <- structure(list(Date = structure(c(8581, 8611, 8643, 8673, 8702, 
8734, 8765, 8796, 8824, 8855, 8884, 8916, 8946, 8975, 9008), class = "Date")), row.names = c(NA, 
-15L), class = c("data.table", "data.frame"))

          Date
 1: 1993-06-30
 2: 1993-07-30
 3: 1993-08-31
 4: 1993-09-30
 5: 1993-10-29
 6: 1993-11-30
 7: 1993-12-31
 8: 1994-01-31
 9: 1994-02-28
10: 1994-03-31
11: 1994-04-29
12: 1994-05-31
13: 1994-06-30
14: 1994-07-29
15: 1994-08-31

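These are just 15 observations, but there can be several hundred. I want to create a column that counts the number of years that have passed since the start of the time series, as shown below. To keep the problem simple, I want to start counting from 0 and add 1 every 12 months.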

          Date  Years
 1: 1993-06-30    0
 2: 1993-07-30    0
 3: 1993-08-31    0
 4: 1993-09-30    0
 5: 1993-10-29    0
 6: 1993-11-30    0
 7: 1993-12-31    0
 8: 1994-01-31    0
 9: 1994-02-28    0    
10: 1994-03-31    0
11: 1994-04-29    0
12: 1994-05-31    0
13: 1994-06-30    1
14: 1994-07-29    1
15: 1994-08-31    1

Note that the time-series starts in June. The beginning can be any month.

One possible solution is a rolling join with a lookup table:

library(data.table)
lut <- setDT(df1)[, .(Date = seq(min(Date), max(Date), by = "1 year"))][
  , Years := seq_along(Date) - 1L][]
lut[df1, on = .(Date), roll = Inf]
          Date Years
 1: 1993-06-30     0
 2: 1993-07-30     0
 3: 1993-08-31     0
 4: 1993-09-30     0
 5: 1993-10-29     0
 6: 1993-11-30     0
 7: 1993-12-31     0
 8: 1994-01-31     0
 9: 1994-02-28     0
10: 1994-03-31     0
11: 1994-04-29     0
12: 1994-05-31     0
13: 1994-06-30     1
14: 1994-07-29     1
15: 1994-08-31     1

The lookup table is

lut
         Date Years
1: 1993-06-30     0
2: 1994-06-30     1
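
If the goal is to add the column to the original table rather than print a joined copy, the same rolling join can feed an assignment by reference. This is only a sketch: the sample data are rebuilt from the question so the block runs on its own, and it assumes every date in df1 falls on or after the first anniversary row of the lookup table.

```r
library(data.table)

# Sample data from the question, rebuilt so the sketch is self-contained
df1 <- data.table(Date = as.Date(c(
  "1993-06-30", "1993-07-30", "1993-08-31", "1993-09-30", "1993-10-29",
  "1993-11-30", "1993-12-31", "1994-01-31", "1994-02-28", "1994-03-31",
  "1994-04-29", "1994-05-31", "1994-06-30", "1994-07-29", "1994-08-31"
)))

# Lookup table: one row per anniversary of the first observation
lut <- df1[, .(Date = seq(min(Date), max(Date), by = "1 year"))]
lut[, Years := seq_along(Date) - 1L]

# roll = Inf carries the most recent anniversary forward, so each date in
# df1 picks up the Years value of the last lut row at or before it
df1[, Years := lut[df1, on = .(Date), roll = Inf, Years]]

print(df1)
```

The join returns one Years value per row of df1, in df1's row order, which is why it can be assigned directly with `:=`.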
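
Another option is to divide the number of days since the first observation by 365.25:

df1[, Years := floor(as.numeric(Date - min(Date)) / 365.25)]

Note that this is only approximate: 1994-06-30 is exactly 365 days after 1993-06-30, so it still gets 0 rather than 1.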
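
Counting whole calendar months instead of days avoids depending on the length of a year. The following is a sketch of that idea, not something taken from the answers above: it uses data.table's year() and month() helpers, the column name months_elapsed is made up for the illustration, and it assumes that only the calendar month of each observation matters (the day of the month is ignored).

```r
library(data.table)

# Sample data from the question, rebuilt so the sketch is self-contained
df1 <- data.table(Date = as.Date(c(
  "1993-06-30", "1993-07-30", "1993-08-31", "1993-09-30", "1993-10-29",
  "1993-11-30", "1993-12-31", "1994-01-31", "1994-02-28", "1994-03-31",
  "1994-04-29", "1994-05-31", "1994-06-30", "1994-07-29", "1994-08-31"
)))

# Whole calendar months between each date and the first observation
# (months_elapsed is a hypothetical helper column)
df1[, months_elapsed := (year(Date) - year(min(Date))) * 12L +
                        (month(Date) - month(min(Date)))]

# Integer division by 12: 0 for the first twelve months, 1 for the next
# twelve, and so on
df1[, Years := months_elapsed %/% 12L]

print(df1)
```

Because the count is based on calendar months, it also works when months are missing from the series, and the starting month can be any month of the year.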
