Conditional statement based on unique IDs and matching dates in R
I'm hoping someone can help with my problem. I'm new to R and can't figure it out.
I have a data frame with multiple rows per ID and a lot of missing data. I want R to create a new column that applies a calculation if, for each unique ID, the dates match.
An example data frame:
example <- data.frame(id = c("A01","A01","A01", "A02","A02"),
al = c(14,NA,56,89,NA),
cr = c(NA,100,NA,NA,87),
date = c("2014-10-29","2014-10-29","2022-01-01", "1993-10-22", "1993-10-22"))
example$date <- as.Date(example$date)
For each unique ID (A01 and A02), if "cr" and "al" were taken on the same date, create a new column called ACR and apply this: (example$al * 100) / ((example$cr * 0.0113) * 0.01).
I have tried group_by() and mutate(), but I can't figure out how to ask whether two dates within an ID match.
example2 <- example %>%
dplyr::group_by(id) %>%
dplyr::mutate(ACR = if_else(date==date), (example$al*100)/((example$cr*0.0113)*0.01), 0, NA)
Thank you so much in advance.
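library(dplyr)
# Returns the input (all NA) if every value is missing;
# otherwise returns the largest non-missing value.
select_value <- function(x) {
  if (all(is.na(x))) {
    return(x)
  } else {
    return(max(x, na.rm = TRUE))
  }
}
# Grouping by id and date puts "al" and "cr" taken on the same day in one group.
# Note: the 0.0113 factor from the question's formula is not applied here;
# multiply cr by 0.0113 to reproduce that formula.
example %>%
  group_by(id, date) %>%
  mutate(ACR = (select_value(al) * 100) / (select_value(cr) * 0.01)) %>%
  ungroup()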
id al cr date ACR
<fct> <dbl> <dbl> <date> <dbl>
1 A01 14 NA 2014-10-29 1400
2 A01 NA 100 2014-10-29 1400
3 A01 56 NA 2022-01-01 NA
4 A02 89 NA 1993-10-22 10230.
5 A02 NA 87 1993-10-22 10230.
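The code above leaves out the 0.0113 factor from the question's formula. A minimal sketch of the same grouped calculation with that factor included might look like the following. The helper name first_non_na and the object name result are made up for illustration, and the sketch assumes each id/date group has at most one non-missing value per variable.

```r
library(dplyr)

example <- data.frame(
  id   = c("A01", "A01", "A01", "A02", "A02"),
  al   = c(14, NA, 56, 89, NA),
  cr   = c(NA, 100, NA, NA, 87),
  date = as.Date(c("2014-10-29", "2014-10-29", "2022-01-01",
                   "1993-10-22", "1993-10-22"))
)

# Hypothetical helper: first non-missing value in x, or NA if there is none
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else x[1]
}

# Rows sharing both id and date form one group, so al and cr measured on the
# same day are combined; groups missing either value get ACR = NA
result <- example %>%
  group_by(id, date) %>%
  mutate(ACR = (first_non_na(al) * 100) /
               ((first_non_na(cr) * 0.0113) * 0.01)) %>%
  ungroup()

print(result)
```

All five rows are kept, and the row from 2022-01-01 gets NA because no "cr" was recorded on that date.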
An approach using fill and distinct, assuming that each row holds only one of the two variables (al or cr) when rows share a date.
library(dplyr)
library(tidyr)
example %>%
group_by(id, date) %>%
fill(al:cr, .direction="updown") %>%
distinct() %>%
mutate(ACR = (al * 100) / ((cr * 0.0113) * 0.01)) %>%
ungroup()
# A tibble: 3 × 5
id al cr date ACR
<chr> <dbl> <dbl> <date> <dbl>
1 A01 14 100 2014-10-29 123894.
2 A01 56 NA 2022-01-01 NA
3 A02 89 87 1993-10-22 905300.
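Since this approach relies on each id/date pair having at most one non-missing "al" and one non-missing "cr", it may be worth checking that assumption first. A rough sketch of such a check, where counts, n_al, n_cr and assumption_holds are illustrative names rather than anything from the original answer:

```r
library(dplyr)

example <- data.frame(
  id   = c("A01", "A01", "A01", "A02", "A02"),
  al   = c(14, NA, 56, 89, NA),
  cr   = c(NA, 100, NA, NA, 87),
  date = as.Date(c("2014-10-29", "2014-10-29", "2022-01-01",
                   "1993-10-22", "1993-10-22"))
)

# Count the non-missing al and cr values within each id/date group
counts <- example %>%
  group_by(id, date) %>%
  summarise(n_al = sum(!is.na(al)),
            n_cr = sum(!is.na(cr)),
            .groups = "drop")

# TRUE when no group has more than one value for either variable
assumption_holds <- all(counts$n_al <= 1, counts$n_cr <= 1)

print(counts)
print(assumption_holds)
```

If the check fails, fill followed by distinct would keep several rows for that id/date pair, so the values would need to be reduced (for example by taking a mean or the first value) before computing ACR.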
An option with collapse::fmax
library(collapse)
library(dplyr)
example %>%
group_by(id, date) %>%
mutate(ACR = (fmax(al)*100)/(fmax(cr)*0.01)) %>%
ungroup
Output:
# A tibble: 5 × 5
id al cr date ACR
<chr> <dbl> <dbl> <date> <dbl>
1 A01 14 NA 2014-10-29 1400
2 A01 NA 100 2014-10-29 1400
3 A01 56 NA 2022-01-01 NA
4 A02 89 NA 1993-10-22 10230.
5 A02 NA 87 1993-10-22 10230.
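The answers above all rely on grouping by id and date. For comparison, the same idea can be sketched in base R as a join: split the "al" and "cr" measurements into two tables and merge them on id and date, so only pairs measured on the same day remain. This is an illustrative alternative, not something from the answers; al_rows, cr_rows and paired are made-up names, and the formula is the one from the question, including the 0.0113 factor.

```r
example <- data.frame(
  id   = c("A01", "A01", "A01", "A02", "A02"),
  al   = c(14, NA, 56, 89, NA),
  cr   = c(NA, 100, NA, NA, 87),
  date = as.Date(c("2014-10-29", "2014-10-29", "2022-01-01",
                   "1993-10-22", "1993-10-22"))
)

# Keep only the rows where each measurement is present
al_rows <- example[!is.na(example$al), c("id", "date", "al")]
cr_rows <- example[!is.na(example$cr), c("id", "date", "cr")]

# Inner join: only id/date pairs that have both measurements survive
paired <- merge(al_rows, cr_rows, by = c("id", "date"))

# Formula from the question
paired$ACR <- (paired$al * 100) / ((paired$cr * 0.0113) * 0.01)

print(paired)
```

Unlike the mutate-based versions, this returns one row per matched id/date pair and drops unmatched measurements such as the 2022-01-01 "al" value.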