
User defined function in R with dplyr

I have a data frame and am trying to write a function that counts the number of records by TRT01AN and by another variable chosen by the user. (I am posting a reduced data frame with only one extra variable to keep it simple.)

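library(dplyr)

# data.frame() keeps ID and TRT01AN numeric; cbind() would coerce every column to character
dataframe <- data.frame(ID = c(1, 2, 3, 4, 5, 6),
                        TRT01AN = c(1, 1, 3, 2, 2, 2),
                        AGEGR1 = c("Adult", "Child", "Adolescent", "Adolescent", "Adolescent", "Child"))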



sub1 <- function(SUB1) {
    # Calculate number of subjects in each treatment arm
    bigN1 <- dataframe %>% 
      group_by_(SUB1,TRT01AN) %>% 
      summarise(N = n_distinct(ID))
    return(bigN1)
    
}
bigN1<-sub1(SUB1="AGEGR1")


With group_by_ I get an error that TRT01AN doesn't exist, and with group_by, SUB1 can't be found. Any idea how I can group by both variables: a "permanent" one and one passed as the function argument? Thank you!

Try the curly-curly (embrace) operator, {{ }}, and pass the column name unquoted in the function call:

library(dplyr)

# data.frame() keeps ID and TRT01AN numeric; cbind() would coerce every column to character
dataframe <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6),
  TRT01AN = c(1, 1, 3, 2, 2, 2),
  AGEGR1 = c(
    "Adult",
    "Child",
    "Adolescent",
    "Adolescent",
    "Adolescent",
    "Child"
  )
)



sub1 <- function(SUB1) {
  # Count distinct subjects for each combination of SUB1 and treatment arm
  bigN1 <- dataframe %>%
    group_by({{SUB1}}, TRT01AN) %>%
    summarise(N = n_distinct(ID))
  return(bigN1)
}

bigN1 <- sub1(AGEGR1)
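
If the column name arrives as a character string instead (for example from user input), one option is to select the grouping column with across(all_of(...)) rather than {{ }}. The following is only a sketch under that assumption: sub1_chr and bigN1_chr are hypothetical names that do not appear in the original post, and .groups = "drop" is added here so the result comes back ungrouped.

```r
library(dplyr)

# Same toy data as in the question
dataframe <- data.frame(
  ID      = c(1, 2, 3, 4, 5, 6),
  TRT01AN = c(1, 1, 3, 2, 2, 2),
  AGEGR1  = c("Adult", "Child", "Adolescent", "Adolescent", "Adolescent", "Child")
)

# Hypothetical variant of sub1() that takes the column name as a string
sub1_chr <- function(SUB1) {
  dataframe %>%
    # all_of() looks up the column whose name is stored in SUB1;
    # TRT01AN is the fixed grouping column
    group_by(across(all_of(SUB1)), TRT01AN) %>%
    # Count distinct subjects per group and return an ungrouped result
    summarise(N = n_distinct(ID), .groups = "drop")
}

bigN1_chr <- sub1_chr("AGEGR1")
```

The result should have one row per combination of AGEGR1 and TRT01AN that occurs in the data, with N holding the number of distinct IDs in that combination.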
