ggplot geom_bar plot percentages by group and facet_wrap
I want to plot multiple categories on a single graph, with the percentages of each category adding up to 100%. For example, if I were plotting male versus female, each grouping (male or female) would add up to 100%.
I'm using the following code, where the percentages appear to be computed across all groups on both graphs, i.e. if you added up all the bars on the left- and right-hand graphs, they would total 100%, rather than the yellow bars on the left-hand graph totalling 100%, the purple bars on the left-hand graph totalling 100%, etc.
I appreciate that this is doable by using stat = 'identity', but is there a way to do this in ggplot without wrangling the dataframe prior to plotting?
library(ggplot2)
library(dplyr)  # provides %>%, filter() and select()

tmp <- diamonds %>%
  filter(color %in% c("E", "I")) %>%
  select(color, cut, clarity)

# sum(..count..) is the total over every bar in both facets
ggplot(data = tmp, aes(x = clarity, fill = cut)) +
  geom_bar(aes(y = (..count..) / sum(..count..)), position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(vars(color))
When computing the percentages inside ggplot2, you have to group the data yourself, just as you would when summarizing it before passing it to ggplot. In your case, the PANEL column that ggplot2 adds internally to the data can be used for the grouping.
Using after_stat and tapply, this could be achieved like so:
library(ggplot2)
library(dplyr)

tmp <- diamonds %>%
  filter(color %in% c("E", "I")) %>%
  select(color, cut, clarity)

# Divide each bar's count by the total count of its facet panel
ggplot(data = tmp, aes(x = clarity, fill = cut)) +
  geom_bar(aes(y = after_stat(count / tapply(count, PANEL, sum)[PANEL])), position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(vars(color))
Or using the older ..var.. notation, which newer ggplot2 releases deprecate in favour of after_stat:
ggplot(data = tmp, aes(x = clarity, fill = cut)) +
  geom_bar(aes(y = ..count.. / tapply(..count.., ..PANEL.., sum)[..PANEL..]), position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(vars(color))
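To see what tapply(count, PANEL, sum)[PANEL] does, and why dividing by sum(count) alone gives percentages across both facets, here is a minimal base R sketch. The bars data frame and its numbers are invented for illustration; it only mimics the count and PANEL columns that ggplot2 computes for a bar layer.

```r
# Toy stand-in for the per-bar data of a bar layer: one row per bar,
# with the bar's count and the facet panel it belongs to.
# The numbers are invented for illustration.
bars <- data.frame(
  PANEL = factor(c(1, 1, 1, 2, 2)),
  count = c(10, 30, 60, 5, 15)
)

# Question's approach: divide by the grand total, so all bars in
# all panels together sum to 1.
pct_global <- bars$count / sum(bars$count)

# Answer's approach: tapply() returns one total per panel, and indexing
# that result by PANEL repeats each panel's total once per bar.
panel_totals <- tapply(bars$count, bars$PANEL, sum)
pct_panel <- as.vector(bars$count / panel_totals[bars$PANEL])

print(panel_totals)
print(tapply(pct_panel, bars$PANEL, sum))  # each panel's shares sum to 1
```

Because PANEL is a factor, indexing with it uses the factor's integer codes, which line up with the order of the totals returned by tapply.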
EDIT: If you need to group by more than one variable, I would suggest using a helper function. Here I use dplyr for the computations:
# Share of each bar within its (PANEL, cut) group
comp_pct <- function(count, PANEL, cut) {
  data.frame(count, PANEL, cut) %>%
    group_by(PANEL, cut) %>%
    mutate(pct = count / sum(count)) %>%
    pull(pct)
}

ggplot(data = tmp, aes(x = clarity, fill = cut)) +
  geom_bar(aes(y = after_stat(comp_pct(count, PANEL, fill))), position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(vars(color))
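If you would rather not depend on dplyr inside the helper, the same grouped share can probably be computed with base R's ave(). This is a sketch, not part of the original answer: comp_pct_base is a hypothetical name, and the toy vectors below are invented to show the behaviour outside of a plot.

```r
# Base R sketch of the helper: ave() returns, for every element, the sum
# of `count` within that element's combination of grouping variables.
# `comp_pct_base` is a hypothetical name used only for this illustration.
comp_pct_base <- function(count, ...) {
  count / ave(count, ..., FUN = sum)
}

# Invented toy data: two panels, two fill groups, two bars in each.
count <- c(1, 3, 2, 2, 4, 6, 5, 5)
PANEL <- factor(c(1, 1, 1, 1, 2, 2, 2, 2))
fill  <- c("a", "a", "b", "b", "a", "a", "b", "b")

pct <- comp_pct_base(count, PANEL, fill)
print(pct)
```

In the plot it should be usable the same way as the dplyr helper, i.e. after_stat(comp_pct_base(count, PANEL, fill)), though I have only checked it on toy vectors like the ones here.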