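Dplyr - choosing value in column based on lowest value in other column in R
I am currently working on a dataset with multiple biopsies per patient ID. I need to find the biopsy result closest to a specific date (for each patient individually). A dummy dataset is shown below: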
df <- data.frame(m1 = c("1","1","1","2","2","2"),
patodate=c("2013-06-03","2014-01-06","2018-11-23","2004-03-03","2018-06-25","2018-12-19"),
baselinedate=c("2018-11-09","2018-11-09","2018-11-09","2018-07-24","2018-07-24","2018-07-24"),
biopsy=c("1","2","3","1","2","3"))
I then calculated the time difference between patodate and baselinedate:
df$patodate <- as.Date(df$patodate)
df$baselinedate <- as.Date(df$baselinedate)
library(dplyr)
df <- df %>%
group_by(m1) %>%
mutate(diff = baselinedate - patodate)
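My question now is: I would like to add a new column called "status" that shows (per group m1) the "biopsy" result whose time difference is closest to 0. The final result would be: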
df <- data.frame(m1 = c("1","1","1","2","2","2"),
patodate=c("2013-06-03","2014-01-06","2018-11-23","2004-03-03","2018-06-25","2018-12-19"),
baselinedate=c("2018-11-09","2018-11-09","2018-11-09","2018-07-24","2018-07-24","2018-07-24"),
biopsy=c("1","2","3","1","2","3"),
status=c("3","3","3","2","2","2"))
I hope someone understands the problem and is able to help. Many thanks.
Kind regards,
Tobias Berg
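You can get the index of the minimum absolute difference between the dates within each group. Note that this is the row position within the group; it matches biopsy here because the biopsies are numbered 1 to 3 in row order.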
library(dplyr)
df %>%
group_by(m1) %>%
mutate(status = which.min(abs(patodate - baselinedate))) %>%
ungroup
# m1 patodate baselinedate biopsy status
# <chr> <date> <date> <chr> <int>
#1 1 2013-06-03 2018-11-09 1 3
#2 1 2014-01-06 2018-11-09 2 3
#3 1 2018-11-23 2018-11-09 3 3
#4 2 2004-03-03 2018-07-24 1 2
#5 2 2018-06-25 2018-07-24 2 2
#6 2 2018-12-19 2018-07-24 3 2
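If the biopsy labels are not simply 1, 2, 3 in row order, the position returned by which.min() will no longer equal the biopsy value. A minimal sketch, assuming the goal is the biopsy value itself: use the position to index biopsy. The dummy data is rebuilt from the question so the block runs on its own, and as.numeric() is added only to make the comparison on plain day counts explicit.

```r
library(dplyr)

# Dummy data from the question, with the date columns parsed as Date
df <- data.frame(
  m1 = c("1", "1", "1", "2", "2", "2"),
  patodate = as.Date(c("2013-06-03", "2014-01-06", "2018-11-23",
                       "2004-03-03", "2018-06-25", "2018-12-19")),
  baselinedate = as.Date(c("2018-11-09", "2018-11-09", "2018-11-09",
                           "2018-07-24", "2018-07-24", "2018-07-24")),
  biopsy = c("1", "2", "3", "1", "2", "3")
)

res <- df %>%
  group_by(m1) %>%
  # which.min() gives the position of the smallest absolute gap in days;
  # indexing biopsy with it returns the biopsy value, not the position
  mutate(status = biopsy[which.min(abs(as.numeric(patodate - baselinedate)))]) %>%
  ungroup()

res$status
```

With this variant status keeps the type of biopsy (character here) instead of becoming an integer.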
Here is another approach:
library(dplyr)
library(lubridate)
df %>%
group_by(m1) %>%
mutate(across(contains("date"), ymd),
helper = abs(difftime(baselinedate,patodate))) %>%
mutate(status = biopsy[helper==min(helper)]) %>%
select(-helper)
m1 patodate baselinedate biopsy status
<chr> <date> <date> <chr> <chr>
1 1 2013-06-03 2018-11-09 1 3
2 1 2014-01-06 2018-11-09 2 3
3 1 2018-11-23 2018-11-09 3 3
4 2 2004-03-03 2018-07-24 1 2
5 2 2018-06-25 2018-07-24 2 2
6 2 2018-12-19 2018-07-24 3 2
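One caveat with biopsy[helper == min(helper)]: if two biopsies are equally close to the baseline date, it yields more than one value, and mutate() only accepts a result of length 1 or of the group size. A minimal sketch of one way around this, using hypothetical tied data (the data frame tied and the labels "A", "B", "C" are made up for illustration), keeps only the first of the tied values:

```r
library(dplyr)

# Hypothetical data: biopsies "B" and "C" are both 7 days from baseline
tied <- data.frame(
  m1 = c("1", "1", "1"),
  patodate = as.Date(c("2017-01-01", "2018-11-02", "2018-11-16")),
  baselinedate = as.Date(rep("2018-11-09", 3)),
  biopsy = c("A", "B", "C")
)

tied_res <- tied %>%
  group_by(m1) %>%
  mutate(
    # absolute gap in days as a plain number
    helper = abs(as.numeric(difftime(baselinedate, patodate, units = "days"))),
    # [1] keeps only the first of the tied minima
    status = biopsy[helper == min(helper)][1]
  ) %>%
  ungroup() %>%
  select(-helper)

tied_res$status
```

Whether the first tied biopsy is the right one to keep is a design choice; sorting by patodate beforehand would make that choice explicit.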
We could do the following; the [1] keeps the first position if several rows tie for the minimum:
library(dplyr)
df %>%
group_by(m1) %>%
mutate(status = abs(patodate - baselinedate),
status = which(status == min(status))[1]) %>%
ungroup
-output
# A tibble: 6 × 5
m1 patodate baselinedate biopsy status
<chr> <date> <date> <chr> <int>
1 1 2013-06-03 2018-11-09 1 3
2 1 2014-01-06 2018-11-09 2 3
3 1 2018-11-23 2018-11-09 3 3
4 2 2004-03-03 2018-07-24 1 2
5 2 2018-06-25 2018-07-24 2 2
6 2 2018-12-19 2018-07-24 3 2
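If only the closest biopsy per patient is needed, rather than a status column repeated on every row, a sketch along the following lines may be enough. It assumes dplyr 1.0.0 or later for slice_min(), and rebuilds the question's dummy data so it runs on its own; the name closest is just illustrative.

```r
library(dplyr)

# Dummy data from the question, with the date columns parsed as Date
df <- data.frame(
  m1 = c("1", "1", "1", "2", "2", "2"),
  patodate = as.Date(c("2013-06-03", "2014-01-06", "2018-11-23",
                       "2004-03-03", "2018-06-25", "2018-12-19")),
  baselinedate = as.Date(c("2018-11-09", "2018-11-09", "2018-11-09",
                           "2018-07-24", "2018-07-24", "2018-07-24")),
  biopsy = c("1", "2", "3", "1", "2", "3")
)

closest <- df %>%
  group_by(m1) %>%
  # keep the single row per patient with the smallest absolute gap in days;
  # with_ties = FALSE returns exactly one row even if two gaps are equal
  slice_min(abs(as.numeric(patodate - baselinedate)), n = 1, with_ties = FALSE) %>%
  ungroup()

closest
```

The result has one row per m1, which can be joined back onto the full data if the repeated status column is still wanted.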