Conditional statement based on unique IDs and matching dates in R

I'm hoping someone can help with my problem. I'm new to R and can't figure it out.

I have a data frame with multiple rows per ID and a lot of missing data. I want R to create a new column by applying a calculation wherever, within each unique ID, the dates match.

An example data frame:

    example <- data.frame(id = c("A01","A01","A01", "A02","A02"),
                      al = c(14,NA,56,89,NA),
                      cr = c(NA,100,NA,NA,87),                   
                      date = c("2014-10-29","2014-10-29","2022-01-01", "1993-10-22", "1993-10-22"))
    example$date <- as.Date(example$date)

For each unique ID (A01 and A02), if "cr" and "al" were taken on the same date, create a new column called ACR and apply this: (example$al * 100)/((example$cr * 0.0113) * 0.01).

I have tried group_by() and mutate(), but I can't figure out how to ask whether two dates within an ID match.

    example2 <- example %>%
      dplyr::group_by(id) %>%
      dplyr::mutate(ACR = if_else(date==date), (example$al*100)/((example$cr*0.0113)*0.01), 0, NA)


Thank you so much in advance.

library(dplyr)

# Returns NA if every value in x is missing; otherwise returns the largest non-missing value.
select_value <- function(x){
  if(all(is.na(x))){
    return(x)
  }else{
    return(max(x, na.rm = TRUE))
  }
}

# Group by id and date so that al and cr recorded on the same day are combined.
# The 0.0113 factor from the question's formula is included here.
example %>%
  group_by(id, date) %>%
  mutate(ACR = (select_value(al) * 100) / ((select_value(cr) * 0.0113) * 0.01)) %>%
  ungroup()

  id       al    cr date           ACR
  <fct> <dbl> <dbl> <date>       <dbl>
1 A01      14    NA 2014-10-29 123894.
2 A01      NA   100 2014-10-29 123894.
3 A01      56    NA 2022-01-01     NA 
4 A02      89    NA 1993-10-22 905300.
5 A02      NA    87 1993-10-22 905300.

An approach using fill and distinct, assuming that when rows share an id and a date, each row holds only one of the two variables (al or cr).

library(dplyr)
library(tidyr)

example %>% 
  group_by(id, date) %>% 
  fill(al:cr, .direction="updown") %>% 
  distinct() %>% 
  mutate(ACR = (al * 100) / ((cr * 0.0113) * 0.01)) %>% 
  ungroup()
# A tibble: 3 × 5
  id       al    cr date           ACR
  <chr> <dbl> <dbl> <date>       <dbl>
1 A01      14   100 2014-10-29 123894.
2 A01      56    NA 2022-01-01     NA 
3 A02      89    87 1993-10-22 905300.
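
The assumption above can be checked before filling. The following is a minimal sketch, not part of the original answer: it rebuilds the example data and counts the non-missing al and cr values for each id and date. The names counts, n_al, n_cr and ambiguous are made up for illustration.

```r
library(dplyr)

# Example data from the question.
example <- data.frame(
  id   = c("A01", "A01", "A01", "A02", "A02"),
  al   = c(14, NA, 56, 89, NA),
  cr   = c(NA, 100, NA, NA, 87),
  date = as.Date(c("2014-10-29", "2014-10-29", "2022-01-01",
                   "1993-10-22", "1993-10-22"))
)

# Count the non-missing measurements in each id/date group.
# n_al and n_cr are hypothetical column names used only for this check.
counts <- example %>%
  group_by(id, date) %>%
  summarise(n_al = sum(!is.na(al)),
            n_cr = sum(!is.na(cr)),
            .groups = "drop")

# Groups with more than one al or more than one cr value are ambiguous.
ambiguous <- counts %>% filter(n_al > 1 | n_cr > 1)
nrow(ambiguous)
```

For the example data no group is ambiguous. If ambiguous were not empty, the fill and distinct approach would likely leave more than one row for those groups, so they would need a rule for picking a value first.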

An option with collapse::fmax

library(collapse)
library(dplyr)
# fmax() skips missing values by default; the 0.0113 factor from the question's formula is included.
example %>% 
  group_by(id, date) %>%
  mutate(ACR = (fmax(al) * 100) / ((fmax(cr) * 0.0113) * 0.01)) %>%
  ungroup()

Output:

# A tibble: 5 × 5
  id       al    cr date           ACR
  <chr> <dbl> <dbl> <date>       <dbl>
1 A01      14    NA 2014-10-29 123894.
2 A01      NA   100 2014-10-29 123894.
3 A01      56    NA 2022-01-01     NA 
4 A02      89    NA 1993-10-22 905300.
5 A02      NA    87 1993-10-22 905300.
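
Another way to express "same ID and same date" is a join, which is a different technique from the grouped answers above. The sketch below uses only base R and assumes the formula exactly as written in the question; al_rows, cr_rows and matched are hypothetical names introduced here.

```r
# Example data from the question.
example <- data.frame(
  id   = c("A01", "A01", "A01", "A02", "A02"),
  al   = c(14, NA, 56, 89, NA),
  cr   = c(NA, 100, NA, NA, 87),
  date = as.Date(c("2014-10-29", "2014-10-29", "2022-01-01",
                   "1993-10-22", "1993-10-22"))
)

# Split the measurements into one table per variable, dropping missing rows.
al_rows <- example[!is.na(example$al), c("id", "date", "al")]
cr_rows <- example[!is.na(example$cr), c("id", "date", "cr")]

# merge() performs an inner join by default, so only id/date pairs that
# have both an al and a cr measurement survive.
matched <- merge(al_rows, cr_rows, by = c("id", "date"))

# Apply the formula from the question to the matched pairs.
matched$ACR <- (matched$al * 100) / ((matched$cr * 0.0113) * 0.01)
matched
```

This returns one row per matched id and date, so the A01 measurement from 2022-01-01, which has no cr value on that date, is dropped rather than kept with NA.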
