
Counting observations based on individual date ranges

For each individual ("id") at each sample date ("date"), I am trying to count the number of unique observations within the preceding year and store it as a new column ("n_sample_1y").

I would like to achieve an output like this:

# A tibble: 6 x 3
     id date                n_sample_1y
  <dbl> <dttm>                    <dbl>
1     3 2010-01-10 00:00:00           1
2     3 2010-02-15 00:00:00           2
3     3 2010-03-29 00:00:00           3
4     3 2010-03-29 00:00:00           3
5     3 2011-02-16 00:00:00           2
6     3 2011-06-13 00:00:00           2 

I have been using the lubridate package to calculate the start date ("s_date") of the date range:

mutate(s_date= date - lubridate::years(1), sample_no = match(date, unique(date)))

but I can't seem to get any further.

Any tips or ideas would be greatly appreciated.

Data sample:

df <- structure(list(id = c(3, 3, 3, 3, 3, 4, 4, 4, 5, 5), 
               date = structure(c(1220572800, 1221004800, 1269820800, 1269820800, 1274227200, 1276387200, 1279756800, 1283904000, 1286668800, 1289779200), 
               tzone = "UTC", class = c("POSIXct", "POSIXt"))), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))


Using lubridate and dplyr with a loop

Install and load the lubridate and dplyr packages first.
If I understood your question correctly, here is the sample data and a corresponding solution.

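library(dplyr)
library(lubridate)

df <- structure(list(id = c(3, 3, 3, 3, 3, 4, 4, 4, 5, 5), 
               date = structure(c(1220572800, 1221004800, 1269820800, 1269820800, 1274227200, 1276387200, 1279756800, 1283904000, 1286668800, 1289779200), 
               tzone = "UTC", class = c("POSIXct", "POSIXt"))), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))

final_df <- data.frame()

for (j in unique(df$id)) {
  # Rows belonging to the current individual
  df1 <- df %>% filter(id == j)
  df1$n_sample_1y <- NA_integer_

  for (i in seq_len(nrow(df1))) {
    # Window runs from one year before the current date up to the current date (both inclusive)
    window_start <- df1$date[i] %m-% years(1)
    in_window <- df1$date >= window_start & df1$date <= df1$date[i]

    # Count the distinct dates that fall inside the window
    df1$n_sample_1y[i] <- length(unique(df1$date[in_window]))
  }

  # Append this individual's rows to the combined result
  final_df <- bind_rows(final_df, df1)
}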
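The same window logic can also be written without the explicit outer loop. The following is only a sketch, not part of the original answer: count_unique_1y is a hypothetical helper name, and the example tibble is built from the dates shown in the question's expected output so that the result can be compared against it. It assumes the window is inclusive on both ends, as in the loop above.

```r
library(dplyr)
library(lubridate)

# Hypothetical helper (not from the original answer): for each date, count the
# distinct dates that fall in the window [date - 1 year, date], bounds inclusive.
count_unique_1y <- function(dates) {
  vapply(seq_along(dates), function(i) {
    window_start <- dates[i] %m-% years(1)
    in_window <- dates >= window_start & dates <= dates[i]
    length(unique(dates[in_window]))
  }, integer(1))
}

# Dates copied from the expected output in the question
example <- tibble(
  id   = rep(3, 6),
  date = as.POSIXct(c("2010-01-10", "2010-02-15", "2010-03-29",
                      "2010-03-29", "2011-02-16", "2011-06-13"), tz = "UTC")
)

# Apply the helper separately within each individual
result <- example %>%
  group_by(id) %>%
  mutate(n_sample_1y = count_unique_1y(date)) %>%
  ungroup()

print(result)
```

On these six rows the n_sample_1y column comes out as 1, 2, 3, 3, 2, 2, which matches the output requested in the question. The helper still compares every date with every other date inside a group, so for very large groups a rolling or non-equi join approach may be worth considering instead.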

Hope this updated code helps. Happy coding!

