R data frame - aggregating multiple columns at once
With a data frame df like below:
----------------------
  a   |   b   |   c
------+-------+-------
true  | true  | true
false | true  | false
false | false | false
true  | true  | false
I need to find the percentage of "true" values in each of the columns a, b and c, as a data frame that can be used in a ggplot. How do I go about it?
Note: "true" is a character string here, not the logical TRUE.
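For reproducibility, the table above could be rebuilt roughly as follows. This is only a sketch: it assumes the values are stored as the lowercase strings "true" and "false", as the note says, and the stringsAsFactors argument is added here so the columns stay character on older R versions.

```r
# Rebuild the example table; every column holds the strings "true"/"false".
df <- data.frame(
  a = c("true", "false", "false", "true"),
  b = c("true", "true", "false", "true"),
  c = c("true", "false", "false", "false"),
  stringsAsFactors = FALSE  # keep columns as character on R < 4.0
)

# Check that the columns are character, not logical.
sapply(df, class)
```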
We reshape the data from 'wide' to 'long' format using gather, then take the mean of value == "true" within each group (the mean of a logical vector is the proportion of TRUE values), and use geom_bar to draw a bar plot.
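library(dplyr)
library(tidyr)
library(ggplot2)
library(scales)

# Reshape wide -> long: one row per (column name, value) pair
gather(df, group, value) %>%
  group_by(group) %>%
  # Proportion of "true" strings within each original column
  summarise(perc = mean(value == "true")) %>%
  ggplot(., aes(x = group, y = perc)) +
  geom_bar(stat = "identity") +
  # Label the y axis as percentages
  scale_y_continuous(labels = percent)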
NOTE: This assumes that the columns are of class character.
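If only the summary data frame is needed, the same proportions can probably be obtained without reshaping. The sketch below uses base R only; the example data and the names perc and result are made up here for illustration and are not part of the original answer.

```r
# Example data matching the question; columns are character strings.
df <- data.frame(
  a = c("true", "false", "false", "true"),
  b = c("true", "true", "false", "true"),
  c = c("true", "false", "false", "false"),
  stringsAsFactors = FALSE
)

# df == "true" gives a logical matrix; colMeans returns the
# proportion of TRUE entries in each column.
perc <- colMeans(df == "true")

# One row per original column, the long shape that ggplot expects.
result <- data.frame(group = names(perc), perc = unname(perc))
print(result)
```

The resulting data frame has the same group and perc columns as the summarise() output, so it should work with the same ggplot call.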