
Change count to percentage on faceted, filled geom_bar()/stat_count() plot in ggplot2 R

I have this dataset from a survey:

                         Var1                 by variable value
1           Strongly disagree  Cluster 1 (n = 9)        A     0
2           Strongly disagree Cluster 2 (n = 15)        A     0
3           Somewhat disagree  Cluster 1 (n = 9)        A     0
4           Somewhat disagree Cluster 2 (n = 15)        A     0
5  Neither agree nor disagree  Cluster 1 (n = 9)        A     2
6  Neither agree nor disagree Cluster 2 (n = 15)        A     0
7              Somewhat agree  Cluster 1 (n = 9)        A     1
8              Somewhat agree Cluster 2 (n = 15)        A     0
9              Strongly agree  Cluster 1 (n = 9)        A     6
10             Strongly agree Cluster 2 (n = 15)        A    15
11          Strongly disagree  Cluster 1 (n = 9)        B     1
12          Strongly disagree Cluster 2 (n = 15)        B     0
13          Somewhat disagree  Cluster 1 (n = 9)        B     0
14          Somewhat disagree Cluster 2 (n = 15)        B     0
15 Neither agree nor disagree  Cluster 1 (n = 9)        B     1
16 Neither agree nor disagree Cluster 2 (n = 15)        B     0
17             Somewhat agree  Cluster 1 (n = 9)        B     4
18             Somewhat agree Cluster 2 (n = 15)        B     1
19             Strongly agree  Cluster 1 (n = 9)        B     3
20             Strongly agree Cluster 2 (n = 15)        B    14
21          Strongly disagree  Cluster 1 (n = 9)        C     0
22          Strongly disagree Cluster 2 (n = 15)        C     0
23          Somewhat disagree  Cluster 1 (n = 9)        C     0
24          Somewhat disagree Cluster 2 (n = 15)        C     0
25 Neither agree nor disagree  Cluster 1 (n = 9)        C     3
26 Neither agree nor disagree Cluster 2 (n = 15)        C     0
27             Somewhat agree  Cluster 1 (n = 9)        C     1
28             Somewhat agree Cluster 2 (n = 15)        C     3
29             Strongly agree  Cluster 1 (n = 9)        C     5
30             Strongly agree Cluster 2 (n = 15)        C    12

I originally plotted it like so using ggplot2 to display the count of responses:

( p5 <- ggplot(q5, aes(x = Var1, y = value, fill = variable)) +
    geom_bar(stat = "identity", width = 0.5, position=position_dodge2(reverse = TRUE)) +
    coord_flip() +
    theme(plot.title = element_text(size = 16), axis.text.x = element_text(size = 16),
    axis.title.x = element_text(size = 16),      
    axis.title.y = element_text(size = 16),
    axis.text.y = element_text(size = 16),
    legend.text=element_text(size=16),
    legend.title=element_text(size=16),
    strip.text.x = element_text(size = 16)) +
    ylim(0,20) +
    scale_x_discrete(limits=c("Strongly disagree", "Somewhat disagree", "Neither agree nor disagree", "Somewhat agree", "Strongly agree")) +
    labs(x = "", y = "# of Responses", fill = "Question") +
    facet_grid(. ~ by) )

which gave me this:

[image: plot not included]

However, I want to display the data as a percentage rather than count.

Following this post, I changed the code accordingly to:

( p5 <- ggplot(q5, aes(x = Var1, group = by, fill = variable)) +
    stat_count(mapping = aes(y = ..prop..)) +
    coord_flip() +
    theme(plot.title = element_text(size = 16), axis.text.x = element_text(size = 16),
    axis.title.x = element_text(size = 16),      
    axis.title.y = element_text(size = 16),
    axis.text.y = element_text(size = 16),
    legend.text=element_text(size=16),
    legend.title=element_text(size=16),
    strip.text.x = element_text(size = 16)) +
    scale_y_continuous(limits = c(0,1),labels = scales::percent_format(accuracy = 5L)) +
    scale_x_discrete(limits=c("Strongly disagree", "Somewhat disagree", "Neither agree nor disagree", "Somewhat agree", "Strongly agree")) +
    labs(x = "", y = "% of Responses", fill = "Question") +
    facet_grid(. ~ by) )

However, this gives me this plot:

[image: plot not included]

It seems like the plot is not recognizing my fill argument or the ..prop.. argument for y.

How can I fix this?

I had trouble copy-pasting your data, so I made an example that resembles it:

set.seed(111)
df = expand.grid(Var1=c("strong disagree","disagree","strong agree","agree","neither"),
by=1:2,variable=LETTERS[1:3])
df$value=rnbinom(nrow(df),mu=5,size=0.5)
df$value[df$Var1=="disagree" & df$by==1]=0
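A quick check on this example data, as a sketch in base R, shows what row counting sees. The names rows_per_cell and row_share are only illustrative. Every answer appears once per question in each cluster, so the row tally is the same everywhere and the value column never enters the calculation:

```r
set.seed(111)
df <- expand.grid(Var1 = c("strong disagree", "disagree", "strong agree", "agree", "neither"),
                  by = 1:2, variable = LETTERS[1:3])
df$value <- rnbinom(nrow(df), mu = 5, size = 0.5)
df$value[df$Var1 == "disagree" & df$by == 1] <- 0

# Rows per answer and cluster: this is what stat_count() tallies;
# the `value` column plays no part in it.
rows_per_cell <- table(df$Var1, df$by)
rows_per_cell

# Share of rows per answer within each cluster, roughly what ..prop..
# reports when group = by and the plot is faceted by `by`.
row_share <- prop.table(rows_per_cell, margin = 2)
row_share
```

Each cell holds 3 rows (one per question A, B, C), so each share is 3/15 = 0.2, which would explain bars of identical height.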

The problem above is that stat_count() counts rows within the group you set (group = by) and ignores the value column, which already holds the counts. I think the easier solution is to compute the proportions first and then plot them:

library(ggplot2)
library(tidyr)
library(dplyr)

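# Convert counts to proportions within each cluster (by) and question (variable);
# a group whose total is 0 yields NaN, which replace_na() turns into 0.
df %>%
  group_by(by, variable) %>%
  mutate(value = replace_na(value / sum(value), 0)) %>%
  ggplot(aes(x = Var1, y = value, fill = variable)) +
  geom_col(position = "dodge") +
  facet_wrap(~by) +
  scale_y_continuous(labels = scales::percent_format()) +
  coord_flip()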
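The same idea can be applied to the data from the question. This is a sketch under some assumptions: q5 is rebuilt here from the printed table (the real object may store these columns as factors), and lvls, q5_pct and pct are names introduced only for illustration. Proportions are taken within each cluster and question, so each question's bars sum to 100% within a panel:

```r
library(ggplot2)
library(dplyr)

# Rebuild q5 from the table in the question. expand.grid() varies its first
# argument fastest, which matches the row order printed there.
lvls <- c("Strongly disagree", "Somewhat disagree",
          "Neither agree nor disagree", "Somewhat agree", "Strongly agree")
q5 <- expand.grid(by = c("Cluster 1 (n = 9)", "Cluster 2 (n = 15)"),
                  Var1 = lvls,
                  variable = c("A", "B", "C"),
                  stringsAsFactors = FALSE)
q5$value <- c(0, 0, 0, 0, 2, 0, 1, 0, 6, 15,
              1, 0, 0, 0, 1, 0, 4, 1, 3, 14,
              0, 0, 0, 0, 3, 0, 1, 3, 5, 12)

# Share of responses within each cluster and question. The totals here are
# 9 and 15, so no division by zero can occur.
q5_pct <- q5 %>%
  group_by(by, variable) %>%
  mutate(pct = value / sum(value)) %>%
  ungroup()

# Same layout as the original count plot, with pct on the value axis.
p5 <- ggplot(q5_pct, aes(x = Var1, y = pct, fill = variable)) +
  geom_col(width = 0.5, position = position_dodge2(reverse = TRUE)) +
  coord_flip() +
  scale_y_continuous(limits = c(0, 1),
                     labels = scales::percent_format(accuracy = 5L)) +
  scale_x_discrete(limits = lvls) +
  labs(x = "", y = "% of Responses", fill = "Question") +
  facet_grid(. ~ by)
p5
```

The theme() text-size settings from the question can be added back unchanged. They are left out here only to keep the sketch short.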

