Counting observations based on individual date ranges
For each individual ("id") at each sample date ("date"), I am trying to count the number of unique observations within the preceding year and store it in a new column ("n_sample_1y").
I would like to achieve an output like this:
# A tibble: 6 x 3
     id date                n_sample_1y
  <dbl> <dttm>                    <dbl>
1     3 2010-01-10 00:00:00           1
2     3 2010-02-15 00:00:00           2
3     3 2010-03-29 00:00:00           3
4     3 2010-03-29 00:00:00           3
5     3 2011-02-16 00:00:00           2
6     3 2011-06-13 00:00:00           2
I have been using the lubridate package to calculate the start date ("s_date") of the date range:
mutate(s_date= date - lubridate::years(1), sample_no = match(date, unique(date)))
but I can't seem to progress any further.
Any tips or ideas would be greatly appreciated.
Data sample:
df <- structure(list(id = c(3, 3, 3, 3, 3, 4, 4, 4, 5, 5),
date = structure(c(1220572800, 1221004800, 1269820800, 1269820800, 1274227200, 1276387200, 1279756800, 1283904000, 1286668800, 1289779200),
tzone = "UTC", class = c("POSIXct", "POSIXt"))), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))
Using lubridate, dplyr, and a loop
Install and load the lubridate and dplyr packages first.
If I understood your question correctly, here is the sample data and a corresponding solution.
df <- structure(list(id = c(3, 3, 3, 3, 3, 4, 4, 4, 5, 5),
date = structure(c(1220572800, 1221004800, 1269820800, 1269820800, 1274227200, 1276387200, 1279756800, 1283904000, 1286668800, 1289779200),
tzone = "UTC", class = c("POSIXct", "POSIXt"))), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))
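library(dplyr)
library(lubridate)

final_df <- data.frame()
for (j in unique(df$id)) {
  # Rows belonging to the current individual
  df1 <- df %>% filter(id == j)
  df1$n_sample_1y <- NA_integer_
  for (i in seq_len(nrow(df1))) {
    # Closed one-year window ending at the current row's date
    end_date <- df1$date[i]
    start_date <- end_date %m-% years(1)
    # Number of distinct dates inside the window
    df1$n_sample_1y[i] <- n_distinct(df1$date[df1$date >= start_date & df1$date <= end_date])
  }
  final_df <- bind_rows(final_df, df1)
}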
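The same count can probably be obtained without the outer loop over ids by letting dplyr do the grouping. The following is only a sketch under a few assumptions: the window is closed on both ends (the current date and the date exactly one year earlier both count), and count_distinct_1y is a hypothetical helper name introduced here for illustration, not part of dplyr or lubridate. The data is the same sample as above, rebuilt with tibble() so the block runs on its own.

```r
library(dplyr)
library(lubridate)

# Same sample data as above, rebuilt so this block is self-contained
df <- tibble(
  id = c(3, 3, 3, 3, 3, 4, 4, 4, 5, 5),
  date = as.POSIXct(
    c(1220572800, 1221004800, 1269820800, 1269820800, 1274227200,
      1276387200, 1279756800, 1283904000, 1286668800, 1289779200),
    origin = "1970-01-01", tz = "UTC"
  )
)

# Hypothetical helper (not a library function): for each element of `dates`,
# count the distinct values in the closed window [date - 1 year, date].
count_distinct_1y <- function(dates) {
  starts <- dates %m-% years(1)  # %m-% rolls back to a valid day, e.g. Feb 29 -> Feb 28
  vapply(
    seq_along(dates),
    function(i) n_distinct(dates[dates >= starts[i] & dates <= dates[i]]),
    integer(1)
  )
}

# Apply the helper within each individual
result <- df %>%
  group_by(id) %>%
  mutate(n_sample_1y = count_distinct_1y(date)) %>%
  ungroup()

print(result)
```

Rows that share the same id and date receive the same count, because only distinct dates inside the window are counted. The helper still compares every date against every other date within a group, so for very large groups a rolling or non-equi join approach may scale better.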
Hope this updated code helps. Happy coding!