If I have a simple data frame with 2 factors (a and b) with 2 levels (1 and 2) and 1 variable (x), how do I get the median values of x: median x over each level of factor a, each level of factor b, and each combination of a*b?
library(dplyr)
df <- data.frame(a = as.factor(c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2)),
                 b = as.factor(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2)),
                 x = runif(16))
I've tried various (many) versions of:
df %>%
group_by_(c("a", "b")) %>%
summarize(med_rate = median(df$x))
The results should look like this for the median x of each level of factor a:
a median
1 0.58811
2 0.53167
And like this for the median x of each level of factor b:
b median
1 0.60622
2 0.46096
And like this for the median x for each combinations of a and b:
a b median
1 1 0.66745
1 2 0.34656
2 1 0.50903
2 2 0.55990
Thanks in advance for any help.
library(data.table)
set.seed(123)  # make the example reproducible
df <- data.table(a = as.factor(c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2)),
                 b = as.factor(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2)),
                 x = runif(16))
df[, median(x), by = a]        # median of x for each level of a
df[, median(x), by = b]        # median of x for each level of b
df[, median(x), by = .(a, b)]  # median of x for each combination of a and b
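The calls above return the median in a column with the default name V1. A minimal sketch of the same three summaries with a named result column, so the output matches the `median` header in the question; the names `dt`, `by_a`, `by_b`, and `by_ab` are only illustrative and are not part of the original answer:

```r
library(data.table)

set.seed(123)
# Same layout as the example data: 8 rows per level of a, 4 per a*b cell
dt <- data.table(
  a = factor(rep(1:2, each = 8)),
  b = factor(rep(rep(1:2, each = 4), times = 2)),
  x = runif(16)
)

# Wrapping the expression in .() lets you name the summary column
by_a  <- dt[, .(median = median(x)), by = a]
by_b  <- dt[, .(median = median(x)), by = b]
by_ab <- dt[, .(median = median(x)), by = .(a, b)]
```

Each result is itself a data.table, so it can be printed, merged, or filtered like any other table.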
The following is not very elegant, but it creates a single data.frame that matches your expected result. We create three data.frames (for a, b, and a*b) and combine them into one.
bind_rows(
  # Median of x for each level of a
  df %>%
    rename(factor = a) %>%
    group_by(factor) %>%
    summarize(med_rate = median(x)),
  # Median of x for each level of b
  df %>%
    rename(factor = b) %>%
    group_by(factor) %>%
    summarize(med_rate = median(x)),
  # Median of x for each combination of a and b
  df %>%
    # We create a column for grouping a*b
    mutate(factor = paste(a, b)) %>%
    group_by(factor) %>%
    summarize(med_rate = median(x))
)
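As for why the attempt in the question does not work: `median(df$x)` reaches back to the whole column of the original data frame, so every group gets the same overall median, and `group_by_()` is deprecated in current dplyr. A minimal sketch of the three separate summaries, assuming dplyr 1.0.0 or later for the `.groups` argument (the names `by_a`, `by_b`, and `by_ab` are only illustrative):

```r
library(dplyr)

set.seed(123)
df <- data.frame(
  a = factor(rep(1:2, each = 8)),
  b = factor(rep(rep(1:2, each = 4), times = 2)),
  x = runif(16)
)

# group_by() takes bare column names; median(x) is then evaluated per group
by_a  <- df %>% group_by(a) %>% summarize(median = median(x))
by_b  <- df %>% group_by(b) %>% summarize(median = median(x))

# .groups = "drop" returns an ungrouped result (assumes dplyr >= 1.0.0)
by_ab <- df %>%
  group_by(a, b) %>%
  summarize(median = median(x), .groups = "drop")
```

Referring to `x` rather than `df$x` inside `summarize()` is the important change; the grouping only applies to columns looked up through the grouped data.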