
ggplot geom_bar plot percentages by group and facet_wrap

I want to plot multiple categories on a single graph, with the percentages within each category adding up to 100%. For example, if I were plotting male versus female, each grouping (male or female) would add up to 100%. I'm using the following code, but the percentages appear to be computed across all groups on both graphs: if you added up all the bars on the left-hand and right-hand graphs, they would total 100%, rather than the yellow bars on the left-hand graph totalling 100%, the purple bars on the left-hand graph totalling 100%, and so on.

I appreciate that this is doable by using stat = 'identity', but is there a way to do this in ggplot without wrangling the data frame prior to plotting?

library(ggplot2)
library(dplyr)  # needed for %>%, filter() and select()

tmp <- diamonds %>% filter(color %in% c("E","I")) %>% select(color, cut, clarity)

ggplot(data=tmp,
     aes(x=clarity,
         fill=cut)) + 
  geom_bar(aes(y = (..count..)/sum(..count..)), position="dodge") +
  scale_y_continuous(labels = scales::percent) + facet_wrap(vars(color))


When computing percentages inside ggplot2, you have to group the data just as you would when summarizing it before passing it to ggplot. In your case, the PANEL column that ggplot2 adds internally to the data can be used for the grouping:

Using after_stat and tapply, this could be achieved like so:

library(ggplot2)  
library(dplyr)

tmp <- diamonds %>% filter(color %in% c("E","I")) %>% select(color, cut, clarity)

ggplot(data=tmp,
       aes(x=clarity,
           fill=cut)) + 
  geom_bar(aes(y = after_stat(count/tapply(count, PANEL, sum)[PANEL])), position="dodge") +
  scale_y_continuous(labels = scales::percent) + facet_wrap(vars(color))

Or using the .. notation (deprecated since ggplot2 3.4.0 in favour of after_stat):

ggplot(data=tmp,
       aes(x=clarity,
           fill=cut)) + 
  geom_bar(aes(y = ..count../tapply(..count.., ..PANEL.., sum)[..PANEL..]), position="dodge") +
  scale_y_continuous(labels = scales::percent) + facet_wrap(vars(color))
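To see what the tapply(count, PANEL, sum)[PANEL] idiom is doing, here is a minimal base-R sketch that needs no plotting. The data frame `bars` and its numbers are made up for illustration; it only stands in for the per-bar data that geom_bar computes, where PANEL is a factor identifying the facet.

```r
# Toy stand-in for the per-bar data computed by geom_bar().
# The counts and panel ids are made up for illustration.
bars <- data.frame(
  PANEL = factor(c(1, 1, 1, 2, 2)),
  count = c(10, 30, 60, 5, 15)
)

# tapply() returns one total per panel, ordered by the factor levels.
panel_totals <- tapply(bars$count, bars$PANEL, sum)

# Indexing the totals by the factor PANEL repeats each total once per row,
# so the division is element-wise and every panel sums to 1.
bars$pct <- as.numeric(bars$count / panel_totals[bars$PANEL])

print(bars)
```

Because the totals are looked up per row, the result has the same length as count, which is what the y aesthetic needs.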

EDIT: If you need to group by more than one variable, I would suggest using a helper function; here I use dplyr for the computations:

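# Share of each bar within its PANEL x cut group; returns one value per bar,
# in the same order as the input vectors.
comp_pct <- function(count, PANEL, cut) {
  data.frame(count, PANEL, cut) %>% 
    group_by(PANEL, cut) %>% 
    mutate(pct = count / sum(count)) %>% 
    pull(pct)
}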

ggplot(data=tmp,
       aes(x=clarity,
           fill=cut)) + 
  geom_bar(aes(y = after_stat(comp_pct(count, PANEL, fill))), position="dodge") +
  scale_y_continuous(labels = scales::percent) + facet_wrap(vars(color))
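If you would rather not depend on dplyr inside the helper, the same two-variable grouping can probably be done with base R's ave(). This is only a sketch: `comp_pct_base` is a hypothetical name, and the toy `bars` data below is made up to check the arithmetic outside of ggplot.

```r
# Made-up per-bar counts: two panels, two fill groups in each.
bars <- data.frame(
  PANEL = factor(c(1, 1, 1, 1, 2, 2, 2, 2)),
  fill  = c("Fair", "Fair", "Ideal", "Ideal", "Fair", "Fair", "Ideal", "Ideal"),
  count = c(1, 3, 2, 2, 4, 6, 5, 15)
)

# Hypothetical base-R helper: share of each bar within its PANEL x fill group.
# ave() applies FUN per group and returns a vector aligned with the input.
comp_pct_base <- function(count, PANEL, fill) {
  ave(count, PANEL, fill, FUN = function(x) x / sum(x))
}

bars$pct <- comp_pct_base(bars$count, bars$PANEL, bars$fill)
print(bars)
```

Inside the plot it would be called the same way as the dplyr version, i.e. after_stat(comp_pct_base(count, PANEL, fill)).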
