In R, is there a way to "unfold" the days between start and end dates, based on duration?

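I have a df that gives the start and end date of each observation. An observation usually lasts more than one day, which shows up as a value > 0 in the "duration" column. I would like to add the dates that fall between "start" and "end" (the "duration") to my df as new rows. How can I do this?

Example df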

df <- data.frame(start_date = as.Date(c("1/1/2020", "1/25/2020", "2/11/2020"), format = "%m/%d/%Y"),
                 end_date = as.Date(c("1/5/2020", "1/26/2020", "2/13/2020"), format = "%m/%d/%Y"),
                 duration = c(4, 1, 2))

You can simply subtract df$start_date from df$end_date:

df$end_date - df$start_date
#Time differences in days
#[1] 4 1 2

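Or use difftime (the unit has to be passed by name, because the third positional argument of difftime is tz):

difftime(df$end_date, df$start_date, units = "days")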
#Time differences in days
#[1] 4 1 2
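
If the duration should be stored as a plain number rather than a difftime, one option is to wrap the result in as.numeric(). This is a minimal sketch that assumes ISO-formatted example dates instead of the m/d/Y strings from the question:

```r
# Example data, written as ISO dates so as.Date() needs no format string
df <- data.frame(
  start_date = as.Date(c("2020-01-01", "2020-01-25", "2020-02-11")),
  end_date   = as.Date(c("2020-01-05", "2020-01-26", "2020-02-13"))
)

# difftime() carries its unit; as.numeric() drops it and leaves a number of days
df$duration <- as.numeric(difftime(df$end_date, df$start_date, units = "days"))
df$duration
```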

To get the sequence of dates, use seq:

do.call(c, Map(seq, df$start_date, df$end_date, by=1))
# [1] "2020-01-01" "2020-01-02" "2020-01-03" "2020-01-04" "2020-01-05"
# [6] "2020-01-25" "2020-01-26" "2020-02-11" "2020-02-12" "2020-02-13"
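
The call above returns one long vector of dates. To get those dates as new rows of the data frame, as the question asks, a base R approach along the following lines might work. It is only a sketch: the names n_days, expanded and date are made up for illustration, and it assumes end_date is never earlier than start_date.

```r
df <- data.frame(
  start_date = as.Date(c("2020-01-01", "2020-01-25", "2020-02-11")),
  end_date   = as.Date(c("2020-01-05", "2020-01-26", "2020-02-13")),
  duration   = c(4, 1, 2)
)

# Number of calendar days each observation covers, both endpoints included
n_days <- as.integer(df$end_date - df$start_date) + 1L

# Repeat every row once per covered day
expanded <- df[rep(seq_len(nrow(df)), n_days), ]

# sequence(n_days) counts 1..n within each observation, so subtracting 1
# gives the day offset to add to the start date
expanded$date <- expanded$start_date + (sequence(n_days) - 1L)
rownames(expanded) <- NULL

expanded
```

With the example data this gives 5 + 2 + 3 = 10 rows, one per day of each observation.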

Data:

df <- data.frame(start_date = as.Date(c("1/1/2020", "1/25/2020", "2/11/2020"), "%m/%d/%Y"),
end_date = as.Date(c("1/5/2020", "1/26/2020", "2/13/2020"), "%m/%d/%Y"),
duration = c(4, 1, 2))

Are you looking for a solution like this?

library(dplyr)
library(lubridate)
df %>% 
  mutate(start_date = mdy(start_date),
         end_date = mdy(end_date)) %>% 
  mutate(duration = end_date - start_date)

Data:

df <- data.frame(start_date = c("1/1/2020", "1/25/2020", "2/11/2020"),
                 end_date = c("1/5/2020", "1/26/2020", "2/13/2020"))

Output:

  start_date   end_date duration
1 2020-01-01 2020-01-05   4 days
2 2020-01-25 2020-01-26   1 days
3 2020-02-11 2020-02-13   2 days

Are you looking for this solution?

library(tidyverse)

df %>%
  mutate(date = map2(start_date, end_date, seq, by = '1 day')) %>%
  unnest(date) -> result

result

#  start_date end_date    duration date      
#   <date>     <date>        <dbl> <date>    
# 1 2020-01-01 2020-01-05        4 2020-01-01
# 2 2020-01-01 2020-01-05        4 2020-01-02
# 3 2020-01-01 2020-01-05        4 2020-01-03
# 4 2020-01-01 2020-01-05        4 2020-01-04
# 5 2020-01-01 2020-01-05        4 2020-01-05
# 6 2020-01-25 2020-01-26        1 2020-01-25
# 7 2020-01-25 2020-01-26        1 2020-01-26
# 8 2020-02-11 2020-02-13        2 2020-02-11
# 9 2020-02-11 2020-02-13        2 2020-02-12
#10 2020-02-11 2020-02-13        2 2020-02-13

You can use select to drop the columns you don't need.

Data
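
As a rough illustration of that last step, the sketch below rebuilds the expanded result and then uses select in two ways. It assumes the dplyr, purrr and tidyr packages are installed; dates_only and without_duration are illustrative names, not part of the original answer.

```r
library(dplyr)
library(purrr)
library(tidyr)

df <- data.frame(
  start_date = as.Date(c("2020-01-01", "2020-01-25", "2020-02-11")),
  end_date   = as.Date(c("2020-01-05", "2020-01-26", "2020-02-13")),
  duration   = c(4, 1, 2)
)

# One row per day, as in the answer above
result <- df %>%
  mutate(date = map2(start_date, end_date, seq, by = "1 day")) %>%
  unnest(date)

# Keep only the expanded date column
dates_only <- result %>% select(date)

# Or drop a single unwanted column and keep the rest
without_duration <- result %>% select(-duration)
```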

df <- structure(list(start_date = structure(c(18262, 18286, 18303),class = "Date"),
    end_date = structure(c(18266, 18287, 18305), class = "Date"), 
    duration = c(4, 1, 2)), class = "data.frame", row.names = c(NA, -3L))

