R Creating a New Column with Max/Min Dates
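I have a data frame, and I am trying to create two new columns, max_start_date and max_end_date, based on the combination of person ID and the days_diff column. My problem is that a single person can have multiple date differences. My example is below: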
ex <- structure(list(person_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1), serv_from_dt = structure(c(18262,
18262, 18263, 18264, 18275, 18275, 18275, 18278, 18291), class = "Date"),
serv_to_dt = structure(c(18262, 18265, 18263, 18264, 18275,
18278, 18278, 18278, 18291), class = "Date"), days_diff = c(0,
3, 0, 0, 0, 3, 3, 0, 0)), row.names = c(NA,
-9L), class = c("data.table", "data.frame"))
As you can see, the min/max date groups are: 2020-01-01/2020-01-04 (days_diff of 3), 2020-01-14/2020-01-17 (days_diff of 3), and 2020-01-30/2020-01-30 (because no other dates overlap with 2020-01-30).
The output I want looks like this:
output <- structure(list(person_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1), serv_from_dt = structure(c(18262,
18262, 18263, 18264, 18275, 18275, 18275, 18278, 18291), class = "Date"),
serv_to_dt = structure(c(18262, 18265, 18263, 18264, 18275,
18278, 18278, 18278, 18291), class = "Date"), days_diff = c(0,
3, 0, 0, 0, 3, 3, 0, 0), max_start_date = c("2020-01-01",
"2020-01-01", "2020-01-01", "2020-01-01", "2020-01-14", "2020-01-14",
"2020-01-14", "2020-01-14", "2020-01-30"), max_end_date = c("2020-01-04",
"2020-01-04", "2020-01-04", "2020-01-04", "2020-01-17", "2020-01-17",
"2020-01-17", "2020-01-17", "2020-01-30")), row.names = c(NA,
-9L), class = c("data.table", "data.frame"))
So far, I have:
# time_length() comes from lubridate; := and the chained [ ] calls are data.table syntax
claims_sample[, days_diff := time_length(serv_to_dt - serv_from_dt, unit = 'days'), prs_nat_key][
  , `:=`(max_start_date = serv_from_dt[which.max(days_diff)],
         max_end_date = serv_to_dt[which.max(days_diff)]), prs_nat_key]
But this just repeats 2020-01-01 and 2020-01-04 down the entire column. I would really appreciate any help and suggestions on how to solve this. Thanks in advance!
Here is one suggestion, assuming each date group contains exactly 4 rows:
library(dplyr)
ex %>%
group_by(person_id, x = ceiling(row_number()/4)) %>%
mutate(max_start_date = min(serv_from_dt),
max_end_date = max(serv_to_dt)
)
  person_id serv_from_dt serv_to_dt days_diff     x max_start_date max_end_date
      <dbl> <date>       <date>         <dbl> <dbl> <date>         <date>
1         1 2020-01-01   2020-01-01         0     1 2020-01-01     2020-01-04
2         1 2020-01-01   2020-01-04         3     1 2020-01-01     2020-01-04
3         1 2020-01-02   2020-01-02         0     1 2020-01-01     2020-01-04
4         1 2020-01-03   2020-01-03         0     1 2020-01-01     2020-01-04
5         1 2020-01-14   2020-01-14         0     2 2020-01-14     2020-01-17
6         1 2020-01-14   2020-01-17         3     2 2020-01-14     2020-01-17
7         1 2020-01-14   2020-01-17         3     2 2020-01-14     2020-01-17
8         1 2020-01-17   2020-01-17         0     2 2020-01-14     2020-01-17
9         1 2020-01-30   2020-01-30         0     3 2020-01-30     2020-01-30
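The fixed block size of 4 happens to fit this sample but would not hold in general. A possible sketch that derives the groups from overlapping date ranges instead is shown below. It assumes a row belongs to the current group when its serv_from_dt falls on or before the latest serv_to_dt seen so far for that person. The helper columns prev_end and grp are names introduced here for illustration, and ex is rebuilt as a plain data.frame so the snippet is self-contained.

```r
library(dplyr)

# Sample data from the question, rebuilt as a plain data.frame
ex <- data.frame(
  person_id = rep(1, 9),
  serv_from_dt = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02",
                           "2020-01-03", "2020-01-14", "2020-01-14",
                           "2020-01-14", "2020-01-17", "2020-01-30")),
  serv_to_dt = as.Date(c("2020-01-01", "2020-01-04", "2020-01-02",
                         "2020-01-03", "2020-01-14", "2020-01-17",
                         "2020-01-17", "2020-01-17", "2020-01-30"))
)

res <- ex %>%
  # the running maximum below only makes sense on rows sorted by start date
  arrange(person_id, serv_from_dt, serv_to_dt) %>%
  group_by(person_id) %>%
  mutate(
    # latest end date among all earlier rows of this person (NA on the first row);
    # dates are converted to numbers because cummax() is not defined for Date
    prev_end = lag(cummax(as.numeric(serv_to_dt))),
    # start a new group when a row begins after every earlier row has ended
    grp = cumsum(is.na(prev_end) | as.numeric(serv_from_dt) > prev_end)
  ) %>%
  group_by(person_id, grp) %>%
  mutate(max_start_date = min(serv_from_dt),
         max_end_date = max(serv_to_dt)) %>%
  ungroup() %>%
  select(-prev_end)
```

For the sample data, grp comes out as 1, 1, 1, 1, 2, 2, 2, 2, 3, so the new columns match the desired output (as Date values rather than character strings). Ranges that merely touch, such as the row starting on 2020-01-17, are merged into the preceding group; changing `>` to `>=` would split them instead.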