5 number summary of multiple variables by group in R

Question

My data are structured like this:

group     height   weight   BMI   percentile
A           15        120    19      45
B           12        115    30      12
A           14        70     25      99
C           13        100    18      87
A           15        150    35      85
C           9         98     19      14
B           12        145    21      8
B           17        127    28      55

I need a table summarizing these variables, by group:


group        min  q25  med  q75  max  mean  sd
A 
 height
 weight 
 BMI
 percentile
B
 height
 weight 
 BMI
 percentile
C
 height
 weight 
 BMI
 percentile

I can do each variable individually, but that would take a long time. If anyone has an easier solution, I would appreciate it!

Answer 1

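Here's a possible solution with the tidyverse: reshape the data to long format so that every measurement becomes a row, then summarize by group and measure.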

library(tidyverse)

## simulated data for the example (weight is left out here;
## it would be handled the same way as the other columns)
dat <-
  data.frame(
    group = sample(c("A", "B", "C"), replace = TRUE, size = 100),
    height = rnorm(100),
    bmi = rnorm(100),
    percentile = rnorm(100)
  )

summary_dat <-
  dat %>%
  pivot_longer(c(height, bmi, percentile), names_to = "measure") %>%
  group_by(group, measure) %>%
  summarize(minimum = min(value),
            q25 = quantile(value, probs = 0.25),
            med = median(value),
            q75 = quantile(value, probs = 0.75),
            maximum = max(value),
            average = mean(value),
            standard_deviation = sd(value),
            .groups = "drop")

The result has one row per group and measure:

summary_dat
#> # A tibble: 9 x 9
#>   group measure    minimum    q25     med   q75 maximum  average standard_deviat~
#>   <chr> <chr>        <dbl>  <dbl>   <dbl> <dbl>   <dbl>    <dbl>            <dbl>
#> 1 A     bmi          -1.64 -0.801  0.0359 0.558    1.85  0.00245            0.866
#> 2 A     height       -2.15 -0.905 -0.101  0.344    2.19 -0.233              0.865
#> 3 A     percentile   -3.70 -0.375  0.207  0.774    1.33  0.0803             1.02 
#> 4 B     bmi          -1.93 -0.318  0.125  0.943    3.01  0.284              1.01 
#> 5 B     height       -2.29 -1.25  -0.226  0.720    2.27 -0.186              1.15 
#> 6 B     percentile   -1.92 -0.161  0.441  0.760    1.80  0.222              0.775
#> 7 C     bmi          -1.71 -0.688 -0.124  0.557    2.20 -0.0784             0.948
#> 8 C     height       -1.77 -0.410  0.565  1.01     2.82  0.295              1.08 
#> 9 C     percentile   -1.30 -0.234  0.337  1.01     2.50  0.359              0.890

Created on 2021-12-01 by the reprex package (v2.0.1)
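
For reference, here is a minimal sketch of the same approach applied to the eight rows from the question, including the weight column. The data frame is typed in by hand from the question's table, and `pivot_longer(-group, ...)` is used as an assumption that every column other than `group` is numeric and should be summarized. With only two or three rows per group the quartiles and standard deviations are not very meaningful, so treat this as an illustration of the shape of the result rather than as output to rely on.

```r
library(dplyr)
library(tidyr)

# The eight rows from the question, entered by hand
dat <- data.frame(
  group      = c("A", "B", "A", "C", "A", "C", "B", "B"),
  height     = c(15, 12, 14, 13, 15, 9, 12, 17),
  weight     = c(120, 115, 70, 100, 150, 98, 145, 127),
  BMI        = c(19, 30, 25, 18, 35, 19, 21, 28),
  percentile = c(45, 12, 99, 87, 85, 14, 8, 55)
)

summary_dat <- dat %>%
  # reshape every column except `group` into (measure, value) pairs
  pivot_longer(-group, names_to = "measure") %>%
  group_by(group, measure) %>%
  summarize(minimum = min(value),
            q25 = quantile(value, probs = 0.25),
            med = median(value),
            q75 = quantile(value, probs = 0.75),
            maximum = max(value),
            average = mean(value),
            standard_deviation = sd(value),
            .groups = "drop")

summary_dat
```

This gives one row for each of the 3 groups x 4 measures. Because `-group` selects the columns to reshape, a new numeric variable added to the data is picked up without listing it by name.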
