Adding percentages for the whole group in a stacked ggplot2 bar chart

I am trying to add group percentages via geom_text() in a stacked ggplot2 bar chart with counts on the y-axis. I have already seen and read this question over here, but I don't think it gets me the solution.

Here is a reproducible example:

library(ggplot2)
library(scales)

df <- data.frame(Var1 = rep(c("A", "B", "C"), each = 3),
                 Var2 = rep(c("Gr1", "Gr2", "Gr3"), 3),
                 Freq = c(10, 15, 5, 5, 4, 3, 2, 10, 15))

ggplot(df) + aes(x = Var2, y = Freq, fill = Var1) +
  geom_bar(stat = "identity") +
  geom_text(aes(y = ..count.., label = scales::percent(..count../sum(..count..))),
            stat = "count")

This is the result:

[Image: stacked bar chart]

Just to be sure you understand what I want: I want the percentage of each group Gr1, Gr2, Gr3 above each bar, summing up to 100%.

Basically, these would be the values I get when I do:

prop.table(tapply(df$Freq, df$Var2, sum))

Thanks!

I would suggest creating a precalculated data.frame. I'll do it with dplyr, but you can use whatever you are comfortable with:

library('dplyr')

df2 <- df %>% 
  arrange(Var2, desc(Var1)) %>% # Rearranging in stacking order      
  group_by(Var2) %>% # For each Gr in Var2 
  mutate(Freq2 = cumsum(Freq), # Calculating position of stacked Freq
         prop = 100*Freq/sum(Freq)) # Calculating proportion of Freq

df2

# A tibble: 9 x 5
# Groups:   Var2 [3]
   Var1  Var2  Freq Freq2     prop
  <chr> <chr> <dbl> <dbl>    <dbl>
1     C   Gr1     2     2 11.76471
2     B   Gr1     5     7 29.41176
3     A   Gr1    10    17 58.82353
4     C   Gr2    10    10 34.48276
5     B   Gr2     4    14 13.79310
6     A   Gr2    15    29 51.72414
7     C   Gr3    15    15 65.21739
8     B   Gr3     3    18 13.04348
9     A   Gr3     5    23 21.73913

And resulting plot:

ggplot(data = df2,
       aes(x = Var2, y = Freq,
           fill = Var1)) +
  geom_bar(stat = "identity") +
  geom_text(aes(y = Freq2 + 1,
                label = sprintf('%.2f%%', prop)))

[Image: resulting plot]
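As a variation on the plot above, the manual cumsum() column can be skipped by letting ggplot2 place the labels itself. This is only a sketch, assuming ggplot2 2.2.0 or later (for geom_col() and the vjust argument of position_stack()); the name df_within is made up for illustration. Note that it centres each label inside its segment rather than putting it at the top edge.

```r
library(ggplot2)
library(dplyr)

df <- data.frame(Var1 = rep(c("A", "B", "C"), each = 3),
                 Var2 = rep(c("Gr1", "Gr2", "Gr3"), 3),
                 Freq = c(10, 15, 5, 5, 4, 3, 2, 10, 15))

# Share of each Var1 within its bar (hypothetical helper data frame)
df_within <- df %>%
  group_by(Var2) %>%
  mutate(prop = 100 * Freq / sum(Freq)) %>%
  ungroup()

ggplot(df_within, aes(x = Var2, y = Freq, fill = Var1)) +
  geom_col() +                                   # same as geom_bar(stat = "identity")
  geom_text(aes(label = sprintf('%.2f%%', prop)),
            position = position_stack(vjust = 0.5))  # stack labels like the bars, centred
```

Because the labels are stacked with the same position adjustment as the bars, they should stay aligned even if the stacking order changes.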

Edit:

Okay, I misunderstood you a bit. But I'll use the same approach: in my experience it's better to keep most calculations out of ggplot; it's more predictable that way.

df2 <- df %>% 
  group_by(Var2) %>% # For each Gr in Var2 
  summarise(Freq = sum(Freq)) %>% # Total count per bar
  mutate(Prop = 100*Freq/sum(Freq)) # Share of the grand total

ggplot(data = df,
       aes(x = Var2, y = Freq)) +
  geom_bar(stat = "identity",
           aes(fill = Var1)) +
  geom_text(data = df2,
            aes(y = Freq + 1,
                label = sprintf('%.2f%%', Prop)))
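If you would rather not depend on dplyr, the same per-bar totals can be computed in base R. This is a minimal sketch, not the only way to do it: the name totals is made up for illustration, and geom_col() assumes ggplot2 2.2.0 or later.

```r
library(ggplot2)

df <- data.frame(Var1 = rep(c("A", "B", "C"), each = 3),
                 Var2 = rep(c("Gr1", "Gr2", "Gr3"), 3),
                 Freq = c(10, 15, 5, 5, 4, 3, 2, 10, 15))

# One row per bar: total count and its share of the grand total
totals <- aggregate(Freq ~ Var2, data = df, FUN = sum)
totals$Prop <- 100 * totals$Freq / sum(totals$Freq)

ggplot(df, aes(x = Var2, y = Freq)) +
  geom_col(aes(fill = Var1)) +
  geom_text(data = totals,
            aes(label = sprintf('%.2f%%', Prop)),
            vjust = -0.5)   # nudge the label just above the top of each bar
```

Using vjust instead of adding a constant to y keeps the gap between bar and label independent of the scale of Freq.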

New plot: [image]
