
R dplyr: summarise counts of multiple factors

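I have a data frame that I want to summarise using dplyr. It contains several factors, and for each group I want to report the count of every level of every factor.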

Is there a way to do the following with dplyr without having to name each factor level in the summarise statement?

library(dplyr)

set.seed(123)

s <- rbinom(100,1,0.5)
s <- factor(s,0:1,c('M','F'))
a <- sample(1:4,100,TRUE)
a <- factor(a,1:4,c('oldest','old','young','youngest'))
w <- rnorm(100,40,10)
g <- rep(1:2,each=50)

df <- data.frame(sex=s, age=a, weight=w, group=g)



sm <- df %>% group_by(group) %>% summarise(
  male = sum(ifelse(sex=='M',1,0))
  ,female = sum(ifelse(sex=='F',1,0))
  ,youngest = sum(ifelse(age=='youngest',1,0))
  ,young = sum(ifelse(age=='young',1,0))
  ,old = sum(ifelse(age=='old',1,0))
  ,oldest = sum(ifelse(age=='oldest',1,0))
  ,weight = mean(weight)
)

print(t(sm))

Result:

        [,1]     [,2]
group     1.000  2.00000
male     29.000 24.00000
female   21.000 26.00000
youngest 12.000  8.00000
young    13.000 17.00000
old      12.000 18.00000
oldest   13.000  7.00000
weight   37.461 40.38807

Using dplyr (albeit in a circuitous way!):

library(tidyr)  # spread() comes from tidyr, not dplyr

df %>%
    mutate(row_number1 = row_number(), row_number2 = row_number()) %>%
    spread(sex, row_number1) %>%
    spread(age, row_number2) %>%
    group_by(group) %>%
    mutate_each(funs(ifelse(is.na(.), 0, 1)), -weight) %>%
    mutate(count = 1) %>%
    summarize_each(funs(sum)) %>%
    mutate(weight = weight / (count)) %>%
    select(-count) %>%
    t()

Result:

           [,1]     [,2]
group     1.000  2.00000
weight   37.461 40.38807
M        25.000 28.00000
F        25.000 22.00000
oldest   13.000  7.00000
old      12.000 18.00000
young    13.000 17.00000
youngest 12.000  8.00000
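
The pipeline above relies on spread(), mutate_each() and funs(), which have since been superseded or deprecated in tidyr and dplyr. The same idea (counting every factor level per group without naming the levels) might be sketched with the newer verbs as follows. This is only a sketch: it assumes dplyr 1.0 or later and tidyr 1.0 or later, uses a small hypothetical data frame `toy` rather than the `df` from the question, and assumes that no two factors share a level name, since the levels become column names.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, not the data from the question
toy <- data.frame(
  sex    = factor(c("M", "F", "M", "M")),
  age    = factor(c("old", "young", "old", "young")),
  weight = c(40, 50, 60, 70),
  group  = c(1, 1, 2, 2)
)

# Counts: stack all factor columns, count levels per group, spread back out
counts <- toy %>%
  select(group, where(is.factor)) %>%
  mutate(across(-group, as.character)) %>%   # avoid combining factors with different levels
  pivot_longer(-group, names_to = "variable", values_to = "level") %>%
  count(group, level) %>%
  pivot_wider(names_from = level, values_from = n, values_fill = 0)

# Means of the numeric column, per group
means <- toy %>%
  group_by(group) %>%
  summarise(weight = mean(weight))

# One row per group: level counts plus mean weight
result <- left_join(counts, means, by = "group")
print(t(result))
```

Levels that never occur in a group are filled with 0 by `values_fill`, so every group ends up with the same set of columns.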

I am assuming that for factors you want tables, and for numeric columns (e.g. weight) you want means.

The result may not be formatted the way you would like, but you can do what you want without using dplyr.

sapply(df, function(x) if (is.factor(x)) table(x, df$group) else tapply(x, df$group, mean))

You may also want to look at the reporttools package, including tableNominal and tableContinuous.
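
To make the shape of the base R result easier to see, here is a minimal sketch of the same sapply() idea on a tiny hypothetical data frame (`toy` is made up for illustration and is not the `df` from the question). It uses lapply() so that the result is always a list, and it skips the grouping column, which the one-liner above would otherwise average as well. Both choices are assumptions of this sketch, not part of the original answer.

```r
# Hypothetical toy data, not the data from the question
toy <- data.frame(
  sex    = factor(c("M", "F", "M", "F"), levels = c("M", "F")),
  weight = c(40, 50, 60, 70),
  group  = c(1, 1, 2, 2)
)

# Summarise every column except the grouping column itself
value_cols <- setdiff(names(toy), "group")

res <- lapply(toy[value_cols], function(x) {
  if (is.factor(x)) {
    table(x, toy$group)         # level-by-group counts
  } else {
    tapply(x, toy$group, mean)  # per-group mean
  }
})

print(res$sex)     # 2 x 2 table of counts: rows are levels, columns are groups
print(res$weight)  # named vector of group means
```

Each list element is addressed by column name, so a single count can be read off with, for example, `res$sex["M", "1"]`.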

