
How to calculate a proportion by another variable (not by frequency) with dplyr in R

Using the mtcars data, I want to calculate the proportion of mpg for each combination of cyl and am. How can I calculate it?

mtcars %>%
   group_by(cyl, am) %>%
   summarise(mpg = n(mpg)) %>%
   mutate(mpg.gr = mpg/(sum(mpg))

Thanks in advance!

If I understand you correctly, you want the proportion of records for each combination of cyl and am. If so, your code fails because n() takes no arguments. You also need to ungroup() before calculating the proportions, because summarise() drops only the last grouping variable (am) and leaves the result grouped by cyl.

You could simply do:

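library(dplyr)

mtcars %>%
   group_by(cyl, am) %>%
   summarise(mpg = n()) %>%          # count rows per (cyl, am) group
   ungroup() %>%                     # drop the remaining grouping by cyl
   mutate(mpg.gr = mpg / sum(mpg))   # share of all 32 rows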

#> # A tibble: 6 x 4
#>     cyl    am   mpg mpg.gr
#>   <dbl> <dbl> <int>  <dbl>
#> 1     4     0     3 0.0938
#> 2     4     1     8 0.25  
#> 3     6     0     4 0.125 
#> 4     6     1     3 0.0938
#> 5     8     0    12 0.375 
#> 6     8     1     2 0.0625

Note that thanks to ungroup(), the proportions are computed against the total count of all records. Without it, the summarised data would still be grouped by cyl, and each proportion would be relative to its own cyl group only.
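
The title asks for a proportion "by another variable (not by frequency)", so the goal may instead be each group's share of the total mpg rather than its share of the row count. A minimal sketch under that assumption follows. The names mpg_share, mpg_total and mpg_prop are made up for illustration, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

mpg_share <- mtcars %>%
  group_by(cyl, am) %>%
  # total mpg per (cyl, am) group instead of a row count
  summarise(mpg_total = sum(mpg), .groups = "drop") %>%
  # .groups = "drop" returns an ungrouped tibble, so sum(mpg_total)
  # below is the total over all groups, not a per-cyl total
  mutate(mpg_prop = mpg_total / sum(mpg_total))

mpg_share
```

Here .groups = "drop" plays the same role as the explicit ungroup() above: the mpg_prop column sums to 1 across all six cyl/am combinations.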
