Count the number of months since last observation
My data frame looks like this:
ID Date Value SinceLastObservation
1 1-1-2010 0 0
1 2-1-2010 0 0
1 3-1-2010 1 0
1 4-1-2010 0 1
1 5-1-2010 1 2
1 7-1-2010 0 2
1 9-1-2010 0 4
2 1-1-2011 1 0
2 2-1-2011 0 1
2 3-1-2011 0 2
2 4-1-2011 1 3
2 6-1-2011 1 2
2 8-1-2011 0 2
I want to add a new column (SinceLastObservation) in R that counts the number of months since the previous observation (a row where Value is 1) within each group (grouped by ID). Rows with no earlier observation in their group get 0.
This is my solution, but it is very slow on the large data frame I have.
library(data.table)
library(lubridate)  # provides rollback()
DT <- data.table(df)
DT[, grp := cumsum(Value == "1") - Value, by = list(ID)]
DT[, minDate := rollback(min(Date), preserve_hms = FALSE), by = list(ID, grp)]
DT[, Months_since_last_1_30DPD := mondf(minDate, Date), by = list(ID, grp)]
mondf is a function that counts the number of months between two dates.
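The question does not show the definition of mondf, so the following is only a guess at what such a helper might look like: a minimal base R sketch that counts calendar-month boundaries between two dates. The helper name month_index and the argument names from and to are assumptions made for illustration.

```r
# Hypothetical helper: a running index of the calendar month containing d
# (years since 1900 times 12, plus the zero-based month).
month_index <- function(d) {
  lt <- as.POSIXlt(as.Date(d))
  lt$year * 12L + lt$mon
}

# Assumed shape of mondf: whole calendar months from `from` to `to`.
# Days within the month are ignored; both arguments may be vectors.
mondf <- function(from, to) {
  month_index(to) - month_index(from)
}

mondf(as.Date("2010-03-01"), as.Date("2010-07-01"))  # 4
```

Because both helpers are vectorised, a function like this does not need to be called once per group, which may matter for speed on a large table.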
The example:
data <- data.frame(ID = c("1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2"),
                   Date = c("1-1-2010", "2-1-2010", "3-1-2010", "4-1-2010", "5-1-2010", "7-1-2010", "9-1-2010", "1-1-2011", "2-1-2011", "3-1-2011", "4-1-2011", "6-1-2011", "8-1-2011"),
                   value = c(0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0))
Thanks
Here is a method using lubridate:
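library(tidyverse)
library(lubridate)
data %>%
  as_tibble() %>%
  # Parse the month-day-year strings into Date objects
  mutate(Date = mdy(Date)) %>%
  group_by(ID) %>%
  # Keep the date only on observation rows, then shift it down one row
  # so that each row sees the previous observation rather than its own
  mutate(last_obs = if_else(value == 1, Date, NA_Date_),
         last_obs = lag(last_obs)) %>%
  # Carry the most recent observation date forward within each ID
  fill(last_obs) %>%
  # Length of the interval from last_obs to Date, expressed in months
  mutate(months_since_last_obs = (last_obs %--% Date) / months(1)) %>%
  # Rows with no earlier observation in their group get 0
  replace_na(list(months_since_last_obs = 0))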
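Since the question is about speed on a large table, the same idea can also be sketched in data.table without any per-group date arithmetic. This is an untimed sketch, not a benchmarked replacement: it assumes the dates are month-day-year strings as in the example, and it counts calendar months, so for dates that do not fall on the first of the month it can differ from the interval-based result above. The column names mon and last_mon are made up for illustration.

```r
library(data.table)

DT <- data.table(
  ID = c("1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2"),
  Date = c("1-1-2010", "2-1-2010", "3-1-2010", "4-1-2010", "5-1-2010",
           "7-1-2010", "9-1-2010", "1-1-2011", "2-1-2011", "3-1-2011",
           "4-1-2011", "6-1-2011", "8-1-2011"),
  value = c(0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0)
)

# Parse the strings once, then reduce each date to an integer month index
DT[, Date := as.IDate(Date, format = "%m-%d-%Y")]
DT[, mon := year(Date) * 12L + month(Date)]

# Within each ID: keep the month index on observation rows, lag it by one
# row, and carry the last non-missing value forward
DT[, last_mon := nafill(shift(fifelse(value == 1, mon, NA_integer_)),
                        type = "locf"),
   by = ID]

# Rows with no earlier observation in their group get 0
DT[, SinceLastObservation := fifelse(is.na(last_mon), 0L, mon - last_mon)]
```

On the example data this gives the same SinceLastObservation values as the table in the question. Only the carry-forward step is grouped; the date parsing and the final subtraction run once over the whole column.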