How to map a function to factor variables with group_by and summarise

I have a large data set with many categorical variables that I want to summarise.

Consider this minimal example.

library(dplyr)
library(purrr)    
set.seed(1)
dat <- data.frame(x = rep(LETTERS[1:4], times = c(2:5)),
                y = rep(letters[1:4], times = c(5:2)),
                z = rnorm(14))

I can create frequency tables using map:

dat %>% select_if(is.character) %>% map(table)

For various reasons, I would like to use dplyr to produce the frequency tables instead. The following snippet works:

dat %>% group_by(x) %>% summarise(n())

But the following does not.

dat %>% select_if(is.character) %>% 
    map(function(x) group_by(x) summarise(n())

This throws the following error:

Error: unexpected symbol in "dat %>% select_if(is.character) %>% map(function(x) group_by(x) summarise"

How can I fix this error?

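The anonymous function is missing the pipe %>% between group_by and summarise, and group_by needs a data frame as its first argument. map passes each column to the function as a bare vector, so map over the column names instead and group the original data frame by each name in turn (across and all_of require dplyr 1.0 or later):

dat %>% 
    select_if(is.character) %>% 
    names() %>% 
    set_names() %>% 
    map(function(col) dat %>% group_by(across(all_of(col))) %>% summarise(n = n()))

#$x
# A tibble: 4 x 2
#  x         n
#  <chr> <int>
#1 A         2
#2 B         3
#3 C         4
#4 D         5

#$y
# A tibble: 4 x 2
#  y         n
#  <chr> <int>
#1 a         5
#2 b         4
#3 c         3
#4 d         2

If your columns are factors (the default for data.frame() before R 4.0), select them with is.factor instead of is.character.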

Or, more simply, map over the column names and use count:

dat %>% 
    select_if(is.character) %>% 
    names() %>% 
    set_names() %>% 
    map(function(col) count(dat, across(all_of(col))))

#$x
# A tibble: 4 x 2
#  x         n
#  <chr> <int>
#1 A         2
#2 B         3
#3 C         4
#4 D         5

#$y
# A tibble: 4 x 2
#  y         n
#  <chr> <int>
#1 a         5
#2 b         4
#3 c         3
#4 d         2
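
If one data frame is easier to work with than a list of tables, the per-column counts can be stacked. This is only a sketch, not part of the original answer: the names cols, freq, variable and level are made up for illustration, and stringsAsFactors = FALSE is set explicitly so that x and y are character columns on any R version.

```r
library(dplyr)
library(purrr)

set.seed(1)
dat <- data.frame(x = rep(LETTERS[1:4], times = 2:5),
                  y = rep(letters[1:4], times = 5:2),
                  z = rnorm(14),
                  stringsAsFactors = FALSE)  # keep x and y as character

# Names of the categorical (character) columns.
cols <- names(dat)[map_lgl(dat, is.character)]

# Count each column separately, storing its values under a common
# column name ("level"), then stack the results; bind_rows() records
# the source column name in "variable".
freq <- cols %>%
    set_names() %>%
    map(function(col) count(dat, level = .data[[col]])) %>%
    bind_rows(.id = "variable")

freq
```

Each row of freq holds one variable/level pair and its count n, so the result can be filtered or plotted like any other data frame.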
