
R: calculate proportion of a factor combined with group by in a data.frame

I want to compute several summary statistics per group with summarise() on a data frame. Input data:

dat <- data.frame(ID = 1:10,
                  var1 = as.factor(c("A","B","A","A","B","B","B","C","A","B")),
                  Var2 = as.factor(c("low","medium","low","low","medium","high","high","high","high","high")))

Now I want to group by var1, count the IDs in each group, and calculate the proportion of rows where Var2 is "high". My output should look like this:

  var1 total prop_high
1    A     4      0.25
2    B     5      0.60
3    C     1      1.00

I have the following code so far, but I am stuck on the proportion calculation:

dat2 <- dat %>% 
  group_by(var1) %>%
  summarise(total = n(),
            prop_high = )

You can take the mean of a logical vector to get the proportion: TRUE counts as 1 and FALSE as 0, so the mean is the share of TRUE values.
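As a quick illustration of that idea outside of any grouping, here is a minimal sketch on a made-up vector (the names x and is_high are only for this example and are not part of the question's data):

```r
# Toy vector, invented for illustration only
x <- c("low", "high", "high", "medium")

# Comparison yields a logical vector: FALSE TRUE TRUE FALSE
is_high <- x == "high"

n_high    <- sum(is_high)   # TRUE is coerced to 1, so this counts matches
prop_high <- mean(is_high)  # count of TRUE divided by length

print(n_high)
print(prop_high)
```

Here sum(is_high) is 2 and mean(is_high) is 0.5. The same idea applied per group: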

library(dplyr)

dat %>% 
  group_by(var1) %>%
  summarise(total = n(),
            # mean of a logical vector = share of TRUE values
            prop_high = mean(Var2 == 'high'))
# Equivalent formulation:
#   prop_high = sum(Var2 == 'high') / n()

#   var1  total prop_high
#  <fct> <int>     <dbl>
#1 A         4      0.25
#2 B         5      0.6 
#3 C         1      1   
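If dplyr is not available, the same result can probably be obtained in base R. The following is only a sketch of one alternative, not the approach given in the answer; it uses tapply() and reuses the question's data, and the names total, prop_high, and res are chosen here for illustration:

```r
# Data from the question
dat <- data.frame(ID = 1:10,
                  var1 = as.factor(c("A","B","A","A","B","B","B","C","A","B")),
                  Var2 = as.factor(c("low","medium","low","low","medium","high","high","high","high","high")))

# Number of rows per level of var1
total <- tapply(dat$ID, dat$var1, length)

# Share of rows per level of var1 where Var2 is "high"
prop_high <- tapply(dat$Var2 == "high", dat$var1, mean)

# Assemble a plain data frame with one row per group
res <- data.frame(var1 = names(total),
                  total = as.vector(total),
                  prop_high = as.vector(prop_high))
print(res)
```

This gives totals of 4, 5, and 1 and proportions of 0.25, 0.6, and 1 for groups A, B, and C, matching the dplyr output. Note that if Var2 contained NA values, mean() would return NA for that group unless na.rm = TRUE is passed.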
