
Applying a dplyr function to all variables at once

I have a data frame with one numeric variable ("numeric") and several factor variables. Some factors take the values 0 and 1 (FALSE, TRUE); others run from 0 to 4 (states in a pathology). I would like to summarise the median and IQR of "numeric" for each level of each factor (0 to 1, 0 to 4).

Is there a way to apply this summary to every factor column in the dataset without typing out each variable one by one?

library(dplyr)

group_by(df, othervariable) %>%
  summarise(
    count = n(),
    median = median(numeric, na.rm = TRUE),
    IQR = IQR(numeric, na.rm = TRUE)
  )

The output:

  othervariable count median   IQR
          <dbl> <int>  <dbl> <dbl>
1             0   100   2.46  2.65
2             1   207   3.88  5.86

If your dataset contains only the grouping variables of interest and numeric, you can use purrr's map function to apply the summarise statement to each grouping variable in turn.

library(dplyr)

purrr::map(names(df %>% select(-numeric)), function(i) {
  df %>% 
    group_by(!!sym(i)) %>% 
    summarize(
      count = n(),
      median = median(numeric, na.rm = TRUE),
      IQR = IQR(numeric, na.rm = TRUE)
    )
})

The output is a list of data frames, one per grouping variable, each containing the summary for that variable. The list is unnamed, so its elements follow the column order of df (excluding numeric).
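If the dataset also contains columns that are not grouping variables, or if a single table is easier to work with than a list, the same idea can be adapted. The following is only a sketch under some assumptions: the grouping columns are stored as factors, and the toy data frame with its columns flag and stage is made up for illustration. It picks the factor columns automatically, names each list element after its column, and stacks the results with bind_rows.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: one numeric column and two factor columns
set.seed(1)
df <- tibble(
  numeric = rnorm(20),
  flag    = factor(rep(c(0, 1), each = 10)),
  stage   = factor(rep(0:4, times = 4))
)

# Names of every factor column in df
factor_cols <- names(df)[vapply(df, is.factor, logical(1))]

summaries <- factor_cols %>%
  set_names() %>%   # name each element after its column
  map(function(col) {
    df %>%
      # as.character() so levels from different factors can be stacked later
      group_by(level = as.character(.data[[col]])) %>%
      summarise(
        count  = n(),
        median = median(numeric, na.rm = TRUE),
        IQR    = IQR(numeric, na.rm = TRUE)
      )
  })

# One table, with a "variable" column recording which factor each row came from
combined <- bind_rows(summaries, .id = "variable")
```

Individual summaries are then available by name, for example summaries$stage, while combined holds one row per factor level across all grouping variables.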

