My problem is similar to this: R dplyr rowwise mean or min and other methods? I am wondering whether there are any dplyr functions (or a combination of functions, such as the pivot_ functions) that give the desired output in a typical dplyr one-liner.
library(tidyverse)
set.seed(1)
# Sample data:
sampleData <- data.frame(O = seq(1, 9, by = .1), A = rnorm(81),
                         U = sample(1:81, 81), I = rlnorm(81),
                         R = sample(c(1, 81), 81, replace = TRUE))
# Normal output:
NormalOutput <- sampleData %>% summarise_all(list(min = min, max = max))
NormalOutput
#> O_min A_min U_min I_min R_min O_max A_max U_max I_max R_max
#> 1 1 -2.2147 1 0.1970368 1 9 2.401618 81 14.27712 81
# Expected output:
ExpectedOutput <- data.frame(stats = c('min', 'max'), O = c(1, 9),
                             A = c(-2.2147, 2.401618), U = c(1, 81),
                             I = c(0.1970368, 14.27712), R = c(1, 81))
ExpectedOutput
#> stats O A U I R
#> 1 min 1 -2.214700 1 0.1970368 1
#> 2 max 9 2.401618 81 14.2771200 81
Created on 2020-08-26 by the reprex package (v0.3.0)
Note:
The real data may have a very large number of columns, so the columns cannot be referred to by name.
EDIT
At best, I get this:
sampleData %>% summarise(across(everything(), list(min = min, max = max))) %>%
t() %>% data.frame(Value = .) %>% tibble::rownames_to_column('Variables')
Variables Value
1 O_min 1.0000000
2 O_max 9.0000000
3 A_min -2.2146999
4 A_max 2.4016178
5 U_min 1.0000000
6 U_max 81.0000000
7 I_min 0.1970368
8 I_max 14.2771167
9 R_min 1.0000000
10 R_max 81.0000000
I would suggest a mix of tidyverse functions, as follows. First reshape the data to long format, then aggregate with the summary functions you want, and finally reshape again to obtain the expected output:
library(tidyverse)
sampleData %>%
  # Stack all columns into name/value pairs
  pivot_longer(cols = names(sampleData)) %>%
  group_by(name) %>%
  summarise(Min = min(value, na.rm = TRUE),
            Max = max(value, na.rm = TRUE)) %>%
  rename(var = name) %>%
  # Stack the Min/Max columns, then spread the variables back out
  pivot_longer(cols = -var) %>%
  pivot_wider(names_from = var, values_from = value)
The output:
# A tibble: 2 x 6
name A I O R U
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Min -2.21 0.197 1 1 1
2 Max 2.40 14.3 9 81 81
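One caveat with the approach above: pivot_longer() stacks every selected column into a single value column, so the selected columns need compatible types. Below is a minimal sketch, assuming a hypothetical data frame df with an extra character column id, that restricts the reshape to numeric columns with where(is.numeric):

```r
library(dplyr)
library(tidyr)

# Hypothetical data: 'id' is a character column that should not be summarised.
df <- data.frame(id = c("a", "b", "c"), O = c(1, 5, 9), A = c(-2, 0, 2.5))

res <- df %>%
  # Only the numeric columns are stacked; 'id' is left alone
  pivot_longer(cols = where(is.numeric)) %>%
  group_by(name) %>%
  summarise(Min = min(value, na.rm = TRUE),
            Max = max(value, na.rm = TRUE)) %>%
  # Stack Min/Max, then spread the variables back into columns
  pivot_longer(cols = -name, names_to = "stat") %>%
  pivot_wider(names_from = name, values_from = value)

res
```

The id column is dropped by summarise(), so the result has one row per statistic and one column per numeric variable.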
You can use the new-ish across() to eliminate one of Duck's pivots:
sampleData %>%
  summarise(across(everything(), list(min = min, max = max))) %>%
  pivot_longer(
    cols = everything(),
    names_to = c("var", "stat"),
    names_sep = "_"
  ) %>%
  pivot_wider(id_cols = "stat", names_from = "var")
# # A tibble: 2 x 6
# stat O A U I R
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 min 1 -2.21 1 0.197 1
# 2 max 9 2.40 81 14.3 81
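Note that names_sep = "_" relies on the original column names containing no underscore. Below is a minimal sketch, assuming a hypothetical data frame df whose column names do contain underscores, that splits on the last underscore with names_pattern instead:

```r
library(dplyr)
library(tidyr)

# Hypothetical data with underscores in the column names.
df <- data.frame(body_mass = c(3, 7, 5), bill_len = c(40.5, 38, 45))

res <- df %>%
  summarise(across(everything(), list(min = min, max = max))) %>%
  pivot_longer(
    cols = everything(),
    names_to = c("var", "stat"),
    # Greedy first group: everything up to the LAST underscore is the variable
    names_pattern = "^(.*)_([^_]+)$"
  ) %>%
  pivot_wider(id_cols = "stat", names_from = "var")

res
```

This assumes the function names in the list (min, max) contain no underscore themselves.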
But the nicest is probably markus's suggestion in comments, which I've adapted here:
map_dfr(sampleData, function(x) c(min(x), max(x))) %>%
mutate(stat = c("min", "max"))
# # A tibble: 2 x 6
# O A U I R stat
# <dbl> <dbl> <int> <dbl> <dbl> <chr>
# 1 1 -2.21 1 0.197 1 min
# 2 9 2.40 81 14.3 81 max
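map_dfr() has been marked as superseded since purrr 1.0.0. Below is a minimal sketch of the same idea without it, assuming a small hypothetical data frame df and using range(), which returns c(min, max) for each column:

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for sampleData.
df <- data.frame(O = c(1, 5, 9), A = c(-2, 0, 2.5))

res <- map(df, range) %>%        # named list of length-2 vectors: c(min, max)
  as_tibble() %>%                # each list element becomes a column
  mutate(stat = c("min", "max"), .before = 1)

res
```

The labels in stat depend on range() returning the minimum first, so they would need to change if other summary functions were used.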
While playing with pivot_longer, I discovered that this two-step one-liner also works (building on the answer by @Gregor Thomas, but with only one pivot_ call instead of two or more):
sampleData %>%
  summarise(across(everything(), list(min, max))) %>%
  pivot_longer(everything(), names_to = c(".value", "stats"),
               names_sep = "_")
# A tibble: 2 x 6
stats O A U I R
<chr> <dbl> <dbl> <int> <dbl> <dbl>
1 1 1 -2.21 1 0.197 1
2 2 9 2.40 81 14.3 81
More here: https://tidyr.tidyverse.org/reference/pivot_longer.html#examples
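The stats column above shows 1 and 2 because the functions are passed to across() as an unnamed list. Below is a minimal sketch, assuming a small hypothetical data frame df, showing that naming the list entries yields min and max as labels instead:

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for sampleData.
df <- data.frame(O = c(1, 5, 9), A = c(-2, 0, 2.5))

res <- df %>%
  # Named list entries produce columns O_min, O_max, A_min, A_max
  summarise(across(everything(), list(min = min, max = max))) %>%
  # ".value" keeps the part before "_" as column names;
  # the part after "_" goes into the 'stats' column
  pivot_longer(everything(),
               names_to = c(".value", "stats"),
               names_sep = "_")

res
```

As with any use of names_sep = "_", this assumes the original column names contain no underscore.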