
Calculating years since the first observation of a monthly time series in R with data.table

I am trying to find a way to calculate the number of years since the beginning of a monthly time series using data.table. My data looks like this:

library(data.table)

df1 <- structure(list(Date = structure(c(8581, 8611, 8643, 8673, 8702, 
8734, 8765, 8796, 8824, 8855, 8884, 8916, 8946, 8975, 9008), class = "Date")), row.names = c(NA, 
-15L), class = c("data.table", "data.frame"))

          Date
 1: 1993-06-30
 2: 1993-07-30
 3: 1993-08-31
 4: 1993-09-30
 5: 1993-10-29
 6: 1993-11-30
 7: 1993-12-31
 8: 1994-01-31
 9: 1994-02-28
10: 1994-03-31
11: 1994-04-29
12: 1994-05-31
13: 1994-06-30
14: 1994-07-29
15: 1994-08-31

These are just 15 observations, but there can be hundreds. I would like to create a column that counts the years that have passed since the beginning of the time series, like the one below. To simplify the issue, I want to start counting at 0 and add 1 every 12 months.

          Date Years
 1: 1993-06-30     0
 2: 1993-07-30     0
 3: 1993-08-31     0
 4: 1993-09-30     0
 5: 1993-10-29     0
 6: 1993-11-30     0
 7: 1993-12-31     0
 8: 1994-01-31     0
 9: 1994-02-28     0
10: 1994-03-31     0
11: 1994-04-29     0
12: 1994-05-31     0
13: 1994-06-30     1
14: 1994-07-29     1
15: 1994-08-31     1

Note that the time-series starts in June. The beginning can be any month.

One possible solution is a rolling join with a lookup table:

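library(data.table)
# Build a lookup table with one row per anniversary of the first observation
lut <- setDT(df1)[, .(Date = seq(min(Date), max(Date), by = "1 year"))][
  , Years := seq_along(Date) - 1L][]
# Rolling join: each date takes the Years value of the latest anniversary on or before it
lut[df1, on = .(Date), roll = Inf]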
          Date Years
 1: 1993-06-30     0
 2: 1993-07-30     0
 3: 1993-08-31     0
 4: 1993-09-30     0
 5: 1993-10-29     0
 6: 1993-11-30     0
 7: 1993-12-31     0
 8: 1994-01-31     0
 9: 1994-02-28     0
10: 1994-03-31     0
11: 1994-04-29     0
12: 1994-05-31     0
13: 1994-06-30     1
14: 1994-07-29     1
15: 1994-08-31     1

The lookup table is:

lut
         Date Years
1: 1993-06-30     0
2: 1994-06-30     1
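
To illustrate what the rolling join is doing, the same matching can be sketched in base R with findInterval(): each date is assigned to the most recent anniversary on or before it. This is only an illustration, not part of the original answer. The names anniv and years_fi are made up for the example, and the sample data is rebuilt so the snippet runs on its own.

```r
library(data.table)

# Sample data from the question (day counts since 1970-01-01)
df1 <- data.table(Date = as.Date(c(8581, 8611, 8643, 8673, 8702, 8734, 8765,
                                   8796, 8824, 8855, 8884, 8916, 8946, 8975,
                                   9008), origin = "1970-01-01"))

# Anniversaries of the first observation, one per year (hypothetical name)
anniv <- seq(min(df1$Date), max(df1$Date), by = "1 year")

# findInterval() returns, for each date, the index of the last anniversary
# that is <= the date; subtracting 1 makes the count start at 0
years_fi <- findInterval(as.numeric(df1$Date), as.numeric(anniv)) - 1L
```

Like the rolling join, this compares exact dates, so an observation that falls a day before the anniversary (for example, a month-end that lands on an earlier business day) would still be counted in the previous year.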
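
Another suggested approach divides the number of elapsed days by 365.25:

df1[, Years := floor(as.numeric(Date - min(Date)) / 365.25)]

Note that this is an approximation: 1994-06-30 is 365 days after the first observation, and floor(365 / 365.25) is 0, not the 1 shown in the desired output.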
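
An alternative that avoids day counts altogether is to count whole calendar months since the first observation and integer-divide by 12, which follows the "add 1 every 12 months" rule from the question. The following is a minimal sketch, assuming the first observation is the earliest date; the column name MonthsElapsed is invented for the example, and year() and month() are the helpers exported by data.table.

```r
library(data.table)

# Sample data from the question (day counts since 1970-01-01)
df1 <- data.table(Date = as.Date(c(8581, 8611, 8643, 8673, 8702, 8734, 8765,
                                   8796, 8824, 8855, 8884, 8916, 8946, 8975,
                                   9008), origin = "1970-01-01"))

# Whole calendar months between each observation and the earliest one
# (MonthsElapsed is a hypothetical helper column)
df1[, MonthsElapsed := (year(Date) - year(min(Date))) * 12L +
                       (month(Date) - month(min(Date)))]

# Integer division by 12 gives completed years, starting at 0
df1[, Years := MonthsElapsed %/% 12L]
```

Because only the year and month of each date are used, the day of the month does not matter, so month-end dates that shift between the 28th and the 31st do not affect the result.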


 