
How to apply multiple columns to make a summary list of dataframes in R

I am trying to automate this code and can't figure out how to use the apply or map functions to do it.

This is the set-up:

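library(dplyr)

data("mtcars")

# Turn a count column (default: n) into a proportion within the groups given in `...`
count_to_pct <- function(data, ..., col = n) {
  grouping_vars_expr <- quos(...)  # capture the grouping variables
  col_expr <- enquo(col)           # capture the count column

  data %>%
    group_by(!!!grouping_vars_expr) %>%
    mutate(pct = (!!col_expr) / sum(!!col_expr)) %>%
    ungroup()
}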
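To make the role of `...` concrete, here is a minimal sketch that calls the helper once without grouping variables and once with one. Using `cyl` and `gear` is an assumption made only for illustration; those columns are not part of the question's code. The helper is repeated so the snippet runs on its own.

```r
library(dplyr)

# Helper from the question, repeated so this snippet is self-contained.
count_to_pct <- function(data, ..., col = n) {
  grouping_vars_expr <- quos(...)
  col_expr <- enquo(col)

  data %>%
    group_by(!!!grouping_vars_expr) %>%
    mutate(pct = (!!col_expr) / sum(!!col_expr)) %>%
    ungroup()
}

# No grouping variables: pct is each count's share of the overall total.
overall <- mtcars %>% count(vs) %>% count_to_pct()

# One grouping variable (illustrative): pct is the share within each level of cyl.
within_cyl <- mtcars %>% count(cyl, gear) %>% count_to_pct(cyl)
```

In the first call the `pct` values sum to 1 over the whole table; in the second they sum to 1 within each `cyl` level.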

Here is where the problem comes in: repetitive code. I'm trying to clean it up for my own sanity. How can I pass a list of columns through data %>% count(column) %>% count_to_pct()?

dataframes <- list(
  mtcars %>% count(vs) %>% count_to_pct(),
  mtcars %>% count(am) %>% count_to_pct(),
  mtcars %>% count(gear) %>% count_to_pct(),
  mtcars %>% count(carb) %>% count_to_pct())

Get the data into long format, count each name/value pair, split on name, and apply count_to_pct to each piece:

library(dplyr)
library(tidyr)
library(purrr)

mtcars %>%
  pivot_longer(cols = c(vs, am, gear, carb)) %>%
  count(name, value) %>%
  group_split(name) %>%
  map(count_to_pct)

This is actually much simpler if you don't use the count_to_pct function. Grouping by name and dividing n by its group total gives the same proportions in a single data frame:

mtcars %>%
  pivot_longer(cols = c(vs, am, gear, carb)) %>%
  count(name, value) %>%
  group_by(name) %>%
  mutate(n = n/sum(n))


#  name  value      n
#   <chr> <dbl>  <dbl>
# 1 am        0 0.594 
# 2 am        1 0.406 
# 3 carb      1 0.219 
# 4 carb      2 0.312 
# 5 carb      3 0.0938
# 6 carb      4 0.312 
# 7 carb      6 0.0312
# 8 carb      8 0.0312
# 9 gear      3 0.469 
#10 gear      4 0.375 
#11 gear      5 0.156 
#12 vs        0 0.562 
#13 vs        1 0.438 

If you reference your columns by their character names, you can use lapply and rlang::sym to convert each name into a symbol that can be used inside dplyr:

dataframes_list <- lapply(c("vs", "am", "gear", "carb"), function(x) {
  mtcars %>% count(!!rlang::sym(x)) %>% count_to_pct()
})
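A possible variation on the same idea, sketched here with purrr, returns a named list so each result can be looked up by column name. It selects the column with `across(all_of(x))` instead of `rlang::sym`, and computes `pct` inline, which should be equivalent to calling count_to_pct() with no grouping variables. The names `cols` and `dataframes_named` are introduced only for this example.

```r
library(dplyr)
library(purrr)

# Columns to summarise, given as character names.
cols <- c("vs", "am", "gear", "carb")

dataframes_named <- cols %>%
  set_names() %>%                    # name each element after itself
  map(function(x) {
    mtcars %>%
      count(across(all_of(x))) %>%   # count by the column named in x
      mutate(pct = n / sum(n))       # share of the overall total
  })

# Look up one summary by name.
dataframes_named$gear
```

Because `map()` keeps the names set by `set_names()`, the list elements are named vs, am, gear, and carb rather than being accessible only by position.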
