In R, how to calculate the difference between two date columns by group (id) while keeping the first available date as reference

How can I compute the time between two date columns, by group, while keeping the first (earliest) date in each group as the reference? For example, for id N04 the reference date_1 should remain 2009-07-10 until the next id. I think I am close, but I can't find the right solution.

Please find below a minimal working example:

id <- c("N02", "N02", "N03", "N03", "N04", "N04", "N04", "N04", "N04", "N04")
date_1 <- c ("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10", "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
date_2 <- c ("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10", "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
df1 <- data.frame (id, date_1, date_2)
> df1
    id     date_1     date_2
1  N02 2008-03-15 2008-03-15
2  N02 2008-04-15 2008-04-15
3  N03 2008-06-15 2008-06-15
4  N03 2008-07-15 2008-07-15
5  N04 2009-07-10 2009-07-10
6  N04 2009-07-13 2009-07-13
7  N04 2009-07-15 2009-07-15
8  N04 2009-07-16 2009-07-16
9  N04 2009-07-17 2009-07-17
10 N04 2009-07-20 2009-07-20

My failed attempt:

library(dplyr)

# lag() compares each row with the previous row, not with the group's first date
df2 <- df1 %>%
  group_by(id) %>%
  mutate(diff = difftime(date_2, lag(date_1, default = date_1[1]), units = "days"))
> df2
# A tibble: 10 × 4
# Groups:   id [3]
   id    date_1     date_2     diff         
   <chr> <chr>      <chr>      <drtn>       
 1 N02   2008-03-15 2008-03-15  0.00000 days
 2 N02   2008-04-15 2008-04-15 30.95833 days
 3 N03   2008-06-15 2008-06-15  0.00000 days
 4 N03   2008-07-15 2008-07-15 30.00000 days
 5 N04   2009-07-10 2009-07-10  0.00000 days
 6 N04   2009-07-13 2009-07-13  3.00000 days
 7 N04   2009-07-15 2009-07-15  2.00000 days
 8 N04   2009-07-16 2009-07-16  1.00000 days
 9 N04   2009-07-17 2009-07-17  1.00000 days
10 N04   2009-07-20 2009-07-20  3.00000 days

However, I would like something like this:

id <- c("N02", "N02", "N03", "N03", "N04", "N04", "N04", "N04", "N04", "N04")
date_1 <- c ("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10", "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
date_2 <- c ("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10", "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
diff <- c("0.00000 days", "30.95833 days", "0.00000 days", "30.00000 days", "0.00000 days", "3.00000 days", "5.00000 days", "6.00000 days", "7.00000 days", "10.0000 days")
df2 <- data.frame (id, date_1, date_2, diff)
> df2
    id     date_1     date_2          diff
1  N02 2008-03-15 2008-03-15  0.00000 days
2  N02 2008-04-15 2008-04-15 30.95833 days
3  N03 2008-06-15 2008-06-15  0.00000 days
4  N03 2008-07-15 2008-07-15 30.00000 days
5  N04 2009-07-10 2009-07-10  0.00000 days
6  N04 2009-07-13 2009-07-13  3.00000 days
7  N04 2009-07-15 2009-07-15  5.00000 days
8  N04 2009-07-16 2009-07-16  6.00000 days
9  N04 2009-07-17 2009-07-17  7.00000 days
10 N04 2009-07-20 2009-07-20  10.0000 days

Thank you in advance for your help. Charles

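You were almost there: just use [[1]] (or dplyr::first()) instead of lag(). Inside a grouped mutate(), date_1[[1]] is the first date of the current group, so every row is compared against that fixed reference rather than against the previous row: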

library(dplyr)

df1 %>%
  group_by(id) %>%
  mutate(diff = difftime(date_2, date_1[[1]], units = "days")) %>%
  ungroup()
# A tibble: 10 × 4
   id    date_1     date_2     diff   
   <chr> <chr>      <chr>      <drtn> 
 1 N02   2008-03-15 2008-03-15  0 days
 2 N02   2008-04-15 2008-04-15 31 days
 3 N03   2008-06-15 2008-06-15  0 days
 4 N03   2008-07-15 2008-07-15 30 days
 5 N04   2009-07-10 2009-07-10  0 days
 6 N04   2009-07-13 2009-07-13  3 days
 7 N04   2009-07-15 2009-07-15  5 days
 8 N04   2009-07-16 2009-07-16  6 days
 9 N04   2009-07-17 2009-07-17  7 days
10 N04   2009-07-20 2009-07-20 10 days
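
A note on the 30.95833 days in the question's output: this is most likely a daylight-saving artifact. With character input, difftime() converts through as.POSIXct() in the local time zone, so a clock change between the two dates removes one hour (31 days minus 1 hour is 30.95833 days). The sketch below is one way to avoid that, assuming both columns hold calendar dates with no time-of-day component: convert them to Date first, then subtract. The name `res` and the numeric `diff` column are illustrative choices, not part of the original answer.

```r
library(dplyr)

id <- c("N02", "N02", "N03", "N03", "N04", "N04", "N04", "N04", "N04", "N04")
date_1 <- c("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10",
            "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
date_2 <- date_1
df1 <- data.frame(id, date_1, date_2)

res <- df1 %>%
  # Date values carry no time zone, so daylight saving cannot shift the result
  mutate(across(c(date_1, date_2), as.Date)) %>%
  group_by(id) %>%
  # first(date_1) is the reference for the whole group;
  # use min(date_1) instead if rows are not already sorted by date
  mutate(diff = as.numeric(date_2 - first(date_1), units = "days")) %>%
  ungroup()

res
```

Here `diff` is stored as a plain number of days, which is often easier to filter or plot than a difftime; drop the as.numeric() call to keep the difftime class.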

Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original address.