Return statistics like min or max from columns into rows with dplyr pipeline

Question

My problem is similar to this question: "R dplyr rowwise mean or min and other methods?" Is there a dplyr function (or a combination of functions, such as the pivot_ family) that gives the desired output in a typical dplyr one-liner?

library(tidyverse)
set.seed(1)

# Sample data:
sampleData <- data.frame(O = seq(1, 9, by = .1), A = rnorm(81), U = sample(1:81,
    81), I = rlnorm(81), R = sample(c(1, 81), 81, replace = TRUE))
 
# Normal output:
NormalOutput <- sampleData %>% summarise_all(list(min = min, max = max))
NormalOutput
#>   O_min   A_min U_min     I_min R_min O_max    A_max U_max    I_max R_max
#> 1     1 -2.2147     1 0.1970368     1     9 2.401618    81 14.27712    81

# Expected output:
ExpectedOutput <- data.frame(stats = c('min', 'max'), O = c(1, 9), A = c(-2.2147,
    2.401618), U = c(1, 81), I = c(0.1970368, 14.27712), R = c(1, 81))
ExpectedOutput
#>   stats O         A  U          I  R
#> 1   min 1 -2.214700  1  0.1970368  1
#> 2   max 9  2.401618 81 14.2771200 81

Created on 2020-08-26 by the reprex package (v0.3.0)

Note:

The real data may have a very large number of columns, so the column names cannot be typed out individually.

EDIT

The closest I have come is this:

sampleData %>% summarise(across(everything(), list(min = min, max = max))) %>% 
    t() %>% data.frame(Value = .) %>% tibble::rownames_to_column('Variables')

   Variables      Value
1      O_min  1.0000000
2      O_max  9.0000000
3      A_min -2.2146999
4      A_max  2.4016178
5      U_min  1.0000000
6      U_max 81.0000000
7      I_min  0.1970368
8      I_max 14.2771167
9      R_min  1.0000000
10     R_max 81.0000000

Answer 1

I would suggest a mix of tidyverse functions, as shown below. Reshape the data to long format, aggregate with the summary functions you want, and then reshape again to obtain the expected output:

library(tidyverse)

sampleData %>% pivot_longer(cols = names(sampleData)) %>%
  group_by(name) %>% summarise(Min = min(value, na.rm = TRUE),
                               Max = max(value, na.rm = TRUE)) %>%
  rename(var=name) %>%
  pivot_longer(cols = -var) %>%
  pivot_wider(names_from = var,values_from=value)

The output:

# A tibble: 2 x 6
  name      A      I     O     R     U
  <chr> <dbl>  <dbl> <dbl> <dbl> <dbl>
1 Min   -2.21  0.197     1     1     1
2 Max    2.40 14.3       9    81    81
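
To make the three reshaping steps easier to follow, here is a minimal sketch that runs the same pipeline in stages. The data frame toy and the intermediate names long, stats and wide are made up for illustration and are not part of the answer:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, much smaller than sampleData
toy <- data.frame(O = c(1, 5, 9), A = c(-2, 0, 2))

# Step 1: long format, one row per (column name, value) pair
long <- toy %>% pivot_longer(cols = everything())

# Step 2: one row per original column, with Min and Max as columns
stats <- long %>%
  group_by(name) %>%
  summarise(Min = min(value, na.rm = TRUE),
            Max = max(value, na.rm = TRUE))

# Step 3: swap rows and columns, so each statistic becomes a row
wide <- stats %>%
  rename(var = name) %>%
  pivot_longer(cols = -var) %>%
  pivot_wider(names_from = var, values_from = value)

wide
```

Note that group_by() sorts the groups, which is why the columns come back in alphabetical order rather than in the original column order.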

Answer 2

You can use across() (added in dplyr 1.0.0) to eliminate one of the pivots in Duck's answer:

sampleData %>%
  summarise(across(everything(),
                   list(min = min, max = max))) %>%
  pivot_longer(
    cols = everything(),
    names_to = c("var", "stat"),
    names_sep = "_"
  ) %>%
  pivot_wider(id_cols = "stat",
              names_from = "var")
# # A tibble: 2 x 6
#   stat      O     A     U      I     R
#   <chr> <dbl> <dbl> <dbl>  <dbl> <dbl>
# 1 min       1 -2.21     1  0.197     1
# 2 max       9  2.40    81 14.3      81

But the nicest is probably markus's suggestion in the comments, which I've adapted here:

map_dfr(sampleData, function(x) c(min(x), max(x))) %>%
  mutate(stat = c("min", "max"))
# # A tibble: 2 x 6
#       O     A     U      I     R stat 
#   <dbl> <dbl> <int>  <dbl> <dbl> <chr>
# 1     1 -2.21     1  0.197     1 min  
# 2     9  2.40    81 14.3      81 max
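
The map_dfr() version relies on the hard-coded labels in mutate() being in the same order as the values returned by the function. A possible variation, which is my own sketch and not from the comments, keeps each label attached to its value by returning a named vector from base R's sapply(). The toy data frame and the names stat_matrix and res are assumptions for illustration:

```r
library(tibble)

# Hypothetical toy data, much smaller than sampleData
toy <- data.frame(O = c(1, 5, 9), A = c(-2, 0, 2), U = c(3L, 1L, 2L))

# sapply() over the columns returns a matrix:
# one row per statistic (named min, max), one column per variable
stat_matrix <- sapply(toy, function(x) c(min = min(x), max = max(x)))

# Turn the row names into a regular "stat" column
res <- rownames_to_column(as.data.frame(stat_matrix), var = "stat")

res
```

One trade-off: the matrix step coerces every column to a common type, so an integer column such as U comes back as double.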

Answer 3

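While playing with pivot_longer, I discovered that this two-step one-liner also works. It builds on the answer by @Gregor Thomas, but needs only one pivot_ call instead of two or more. Because the list passed to across() is unnamed, the stats column holds the positions 1 and 2 rather than the labels min and max: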

sampleData %>% 
    summarise(across(everything(), list(min, max))) %>% 
        pivot_longer(everything(), names_to = c(".value", "stats"),
                     names_sep = "_")

# A tibble: 2 x 6
  stats     O     A     U      I     R
  <chr> <dbl> <dbl> <int>  <dbl> <dbl>
1 1         1 -2.21     1  0.197     1
2 2         9  2.40    81 14.3      81

More here: https://tidyr.tidyverse.org/reference/pivot_longer.html#examples
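
If you want the stats column to read min and max instead of 1 and 2, naming the list elements should be enough. A minimal sketch, assuming a small made-up data frame toy whose column names contain no underscores (names_sep = "_" splits at every underscore, so names such as my_col would need a different separator or names_pattern):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, much smaller than sampleData
toy <- data.frame(O = c(1, 5, 9), A = c(-2, 0, 2), U = c(3L, 1L, 2L))

res <- toy %>%
  # Named list: the summary columns are called O_min, O_max, A_min, ...
  summarise(across(everything(), list(min = min, max = max))) %>%
  # ".value" sends the part before "_" back to column names,
  # while the part after "_" becomes the stats column
  pivot_longer(everything(),
               names_to = c(".value", "stats"),
               names_sep = "_")

res
```

Because each variable keeps its own column here, U stays an integer column, as in the output above.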

3 answers

solution1 1 ACCEPTED 2020-08-26 17:52:23

solution2 1 2020-08-26 18:33:29

solution3 0 2020-08-27 16:46:27
