简体   繁体   中英

Return statistics like min or max from columns into rows with dplyr pipeline

My problem is similar to this: R dplyr rowwise mean or min and other methods? Wondering if there is any functions (or combination of functions such as pivot_ etc.), that might give the desired output in a usual dplyr one-liner ?

library(tidyverse); set.seed(1); 

#Sample Data: 
sampleData <- data.frame(O = seq(1, 9, by = .1), A = rnorm(81), U = sample(1:81,
    81), I = rlnorm(81),  R = sample(c(1, 81), 81, replace = T)); #sampleData;
 
#NormalOuput:
NormalOuput <- sampleData %>% summarise_all(list(min = min, max = max)); 
NormalOuput;
#>   O_min   A_min U_min     I_min R_min O_max    A_max U_max    I_max R_max
#> 1     1 -2.2147     1 0.1970368     1     9 2.401618    81 14.27712    81

#Expected output:
ExpectedOuput <- data.frame(stats = c('min', 'max'), O = c(1, 9), A = c(-2.2147,
    2.401618), U = c(1, 81), I = c(0.1970368, 14.27712), R = c(1, 81)); 
ExpectedOuput;
#>   stats O         A  U          I  R
#> 1   min 1 -2.214700  1  0.1970368  1
#> 2   max 9  2.401618 81 14.2771200 81

Created on 2020-08-26 by the reprex package (v0.3.0)

Note:

The number of columns might be huge in the real scenario, so the names cannot be called directly.

EDIT

At best, I get this:

sampleData %>% summarise(across(everything(), list(min = min, max = max))) %>% 
    t() %>% data.frame(Value = .) %>% tibble::rownames_to_column('Variables')

   Variables      Value
1      O_min  1.0000000
2      O_max  9.0000000
3      A_min -2.2146999
4      A_max  2.4016178
5      U_min  1.0000000
6      U_max 81.0000000
7      I_min  0.1970368
8      I_max 14.2771167
9      R_min  1.0000000
10     R_max 81.0000000

I would suggest a mix of tidyverse functions like next. You have to reshape your data, then aggregate with the summary functions you want and then as strategy you can re format again and obtain the expected output:

library(tidyverse)

sampleData %>% pivot_longer(cols = names(sampleData)) %>%
  group_by(name) %>% summarise(Min=min(value,na.rm=T),
                               Max=max(value,na.rm=T)) %>% 
  rename(var=name) %>%
  pivot_longer(cols = -var) %>%
  pivot_wider(names_from = var,values_from=value)

The output:

# A tibble: 2 x 6
  name      A      I     O     R     U
  <chr> <dbl>  <dbl> <dbl> <dbl> <dbl>
1 Min   -2.21  0.197     1     1     1
2 Max    2.40 14.3       9    81    81

You can use the new-ish across() to eliminate one of Duck's pivots:

sampleData %>%
  summarise(across(everything(),
                   list(min = min, max = max))) %>%
  pivot_longer(
    cols = everything(),
    names_to = c("var", "stat"),
    names_sep = "_"
  ) %>%
  pivot_wider(id_cols = "stat",
              names_from = "var")
# # A tibble: 2 x 6
#   stat      O     A     U      I     R
#   <chr> <dbl> <dbl> <dbl>  <dbl> <dbl>
# 1 min       1 -2.21     1  0.197     1
# 2 max       9  2.40    81 14.3      81

But the nicest is probably markus's suggestion in comments, which I've adapted here:

map_dfr(sampleData, function(x) c(min(x), max(x))) %>%
  mutate(stat = c("min", "max"))
# # A tibble: 2 x 6
#       O     A     U      I     R stat 
#   <dbl> <dbl> <int>  <dbl> <dbl> <chr>
# 1     1 -2.21     1  0.197     1 min  
# 2     9  2.40    81 14.3      81 max

While playing with pivot_longer , I discovered that this two-step one-liner also works (building on the answer by @Gregor Thomas, here only one pivot_ in stead of two or more):

sampleData %>% 
    summarise(across(everything(), list(min, max))) %>% 
        pivot_longer(everything(), names_to = c(".value", "stats"),
                     names_sep = "_")

# A tibble: 2 x 6
  stats     O     A     U      I     R
  <chr> <dbl> <dbl> <int>  <dbl> <dbl>
1 1         1 -2.21     1  0.197     1
2 2         9  2.40    81 14.3      81

More here: https://tidyr.tidyverse.org/reference/pivot_longer.html#examples

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM