
R data frame - aggregating multiple columns at once

Given a data frame df like the one below:

----------------------
a     |  b    |  c 
------+-------+-------
true  | true  | true
false | true  | false
false | false | false
true  | true  | false

I need to find the percentage of "true" values in each of the columns a, b and c, as a data frame that can be used in a ggplot. How should I go about it?

Note: "true" here is a character string, not the logical TRUE.

We reshape the data from 'wide' to 'long' format using gather, then compute the proportion of "true" values in each 'group' (the mean of value == "true"), and use geom_bar to draw a bar plot.

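library(dplyr)
library(tidyr)
library(ggplot2)
library(scales)

# df1 is the data frame from the question (called df there)
gather(df1, group, value) %>%                  # wide -> long: one row per (column, value) pair
  group_by(group) %>%
  summarise(perc = mean(value == "true")) %>%  # proportion of "true" per original column
  ggplot(aes(x = group, y = perc)) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = percent)         # label the y axis as percentages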

NOTE: This assumes the columns are of class character.
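
If only the summary data frame is needed, the same proportions can also be computed without reshaping. The following is a minimal base R sketch, not part of the original answer. It rebuilds the sample data from the question under the name df1, and the names perc and result are chosen here purely for illustration.

```r
# Sample data from the question, stored as character strings (not logical TRUE/FALSE)
df1 <- data.frame(
  a = c("true", "false", "false", "true"),
  b = c("true", "true",  "false", "true"),
  c = c("true", "false", "false", "false"),
  stringsAsFactors = FALSE
)

# df1 == "true" gives a logical matrix; colMeans then returns the
# proportion of "true" in each column
perc <- colMeans(df1 == "true")

# Arrange as a long data frame with one row per original column,
# the same shape the summarise() step produces
result <- data.frame(
  group = names(perc),
  perc  = unname(perc),
  stringsAsFactors = FALSE
)
print(result)
```

For the sample data this should give 0.50 for a, 0.75 for b and 0.25 for c, and result can be passed to ggplot in the same way as the summarised data frame above.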

