
Create a summary table by group from a dataset with 30 columns in R

Consider this sample data:

ID <- c(1:10)
group <- c("A","A","A","B","B","B","B","B","B","B")
condition_tall <- c(0,1,1,1,1,0,0,0,1,1)
condition_long <- c(1,1,1,1,0,0,0,1,1,1)
condition_wide <- c(1,1,0,0,0,1,1,1,1,0)
check_tall <- c(1,1,1,1,1,1,0,1,0,1)
check_long <- c(1,1,1,1,1,1,0,1,0,1)
check_wide <- c(1,1,0,1,0,1,0,1,0,1)

dat <- data.frame(ID,group,condition_tall,condition_long,condition_wide,check_tall,check_long,check_wide)
dat

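What is the most efficient way to produce a summary table like the one below in R? I want counts and percentages by group for "condition" and "check". Thanks very much.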

           Group A                                             Group B
Variable   Condition (N)  Condition (%)  Check (N)  Check (%)  Condition (N)  Condition (%)  Check (N)  Check (%)
wide
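library(dplyr)
library(tidyr)

dat %>%
  group_by(group) %>%
  summarise(across(-ID, list(n = sum, pct = mean))) %>%
  # Column names such as condition_tall_n split into three parts on '_'
  pivot_longer(-group, names_to = c('name', 'var', 'name1'), names_sep = '_') %>%
  pivot_wider(id_cols = var, names_from = c(group, name, name1))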

Result:

# A tibble: 3 x 9
  var   A_condition_n A_condition_pct A_check_n A_check_pct B_condition_n B_condition_pct B_check_n B_check_pct
  <chr>         <dbl>           <dbl>     <dbl>       <dbl>         <dbl>           <dbl>     <dbl>       <dbl>
1 tall              2           0.667         3       1                 4           0.571         5       0.714
2 long              3           1             3       1                 4           0.571         5       0.714
3 wide              2           0.667         2       0.667             4           0.571         4       0.571
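
The three-way split in pivot_longer relies on the column names that summarise(across(...)) generates. As a minimal sketch, assuming a cut-down copy of the sample data (dat_small and step1 are illustrative names, not part of the answer), the intermediate names can be inspected like this:

```r
library(dplyr)

# Cut-down copy of the sample data (illustrative only)
dat_small <- data.frame(
  ID = 1:10,
  group = c("A","A","A","B","B","B","B","B","B","B"),
  condition_tall = c(0,1,1,1,1,0,0,0,1,1),
  check_tall = c(1,1,1,1,1,1,0,1,0,1)
)

# across() with a named list of functions names each result <column>_<function>
step1 <- dat_small %>%
  group_by(group) %>%
  summarise(across(-ID, list(n = sum, pct = mean)))

# Every summary column has the form <prefix>_<variable>_<statistic>
names(step1)
```

Because each summary column carries exactly two underscores, names_sep = '_' yields the three pieces that the answer calls name, var and name1.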

Another quick approach:

# Each cell becomes a one-element list holding a named vector (n, pct)
fn <- ~list(c(n = sum(.x), pct = mean(.x)))

dat %>%
  pivot_longer(-c(ID, group), names_to = c('name1', 'var'), names_sep = '_') %>%
  pivot_wider(id_cols = var, names_from = c(group, name1), values_fn = fn) %>%
  unnest_wider(-var, names_sep = '_')

Result:

# A tibble: 3 x 9
  var   A_condition_n A_condition_pct A_check_n A_check_pct B_condition_n B_condition_pct B_check_n B_check_pct
  <chr>         <dbl>           <dbl>     <dbl>       <dbl>         <dbl>           <dbl>     <dbl>       <dbl>
1 tall              2           0.667         3       1                 4           0.571         5       0.714
2 long              3           1             3       1                 4           0.571         5       0.714
3 wide              2           0.667         2       0.667             4           0.571         4       0.571
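
The summary function in this approach wraps its result in list(), presumably so that each cell of the widened table receives a single value. A rough base R sketch of what one cell holds, using an ordinary function in place of the formula (cell_summary and cell are hypothetical names) and the group A values of condition_tall:

```r
# Plain-function equivalent of the formula helper (hypothetical name)
cell_summary <- function(x) list(c(n = sum(x), pct = mean(x)))

# Group A values of condition_tall from the sample data
cell <- cell_summary(c(0, 1, 1))

length(cell)      # one list element per cell
names(cell[[1]])  # the element is a named vector with n and pct
```

unnest_wider then spreads those names into separate columns, and names_sep = '_' appends them to the existing column name, which would give names such as A_condition_n and A_condition_pct.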

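You can use the tidyverse packages to reshape your data to long format, compute the summaries you want, and then pivot the result back to wide format: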

library(tidyverse)

wide_dat <- dat %>% 
  pivot_longer(-c(ID, group), names_sep = '_', names_to = c('metric', 'variable')) %>% 
  group_by(group, metric, variable) %>% 
  summarize(
    n = sum(value),
    pct = mean(value),
    .groups = 'drop'  # return an ungrouped result and silence the grouping message
  ) %>%
  pivot_wider(names_from = c(group, metric), values_from = c(n, pct), names_glue = '{group}_{metric}_{.value}', names_vary = 'slowest')

wide_dat

 variable A_check_n A_check_pct A_condition_n A_condition_pct B_check_n B_check_pct B_condition_n B_condition_pct
  <chr>        <dbl>       <dbl>         <dbl>           <dbl>     <dbl>       <dbl>         <dbl>           <dbl>
1 long             3       1                 3           1             5       0.714             4           0.571
2 tall             3       1                 2           0.667         5       0.714             4           0.571
3 wide             2       0.667             2           0.667         4       0.571             4           0.571
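
The pct columns above are proportions between 0 and 1, while the question asks for percentages. One possible follow-up step, sketched here on a small stand-in table (wide_demo and pct_demo are illustrative names; the values are copied from the output above), is to rescale every column ending in _pct:

```r
library(dplyr)

# Small stand-in for wide_dat (illustrative only)
wide_demo <- data.frame(
  variable = c("long", "tall"),
  A_condition_n = c(3, 2),
  A_condition_pct = c(1, 2/3)
)

# Convert proportions to percentages, rounded to one decimal place
pct_demo <- wide_demo %>%
  mutate(across(ends_with("_pct"), ~ round(100 * .x, 1)))

pct_demo
```

The same mutate() call should apply unchanged to wide_dat, since it selects columns by their _pct suffix.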
