The code below produces a sample dataset:
x = c(rep("Category1",5), rep("Category2", 8), rep("Category3", 7))
y = c(rep("No", 2), rep("I don't know", 1), rep("Yes", 2), rep("No", 2), rep("I don't know", 2), rep("Yes", 2), rep("No", 2), rep("I don't know", 3), rep("Yes", 2),rep("No", 2))
df = data.frame(x,y)
colnames(df) = c("Category", "Response")
I am trying to get percentage plots (using ggplot) of the "Response" column. So far I have been able to get a proportion table of "Response" using:
library(dplyr)
total = df %>%
  count(Response) %>%
  mutate(prop = prop.table(n))
total
to get
      Response n prop
1 I don't know 6  0.3
2           No 8  0.4
3          Yes 6  0.3
Now I would like to plot a bar chart of those percentages for "Yes", "No" and "I don't know", and label the bars with those percentages.
Next I would like to plot the same thing, but this time grouped by "Category". I have used the code below:
df1 = df %>%
  count(Category, Response) %>%
  group_by(Category) %>%
  mutate(prop = prop.table(n))
df1
to get this output:
  Category  Response         n  prop
  <chr>     <chr>        <int> <dbl>
1 Category1 I don't know     1 0.2
2 Category1 No               2 0.4
3 Category1 Yes              2 0.4
4 Category2 I don't know     2 0.25
5 Category2 No               4 0.5
6 Category2 Yes              2 0.25
7 Category3 I don't know     3 0.429
8 Category3 No               2 0.286
9 Category3 Yes              2 0.286
As you can see, prop is now the proportion within each Category. I am new to ggplot, and all I manage to plot is the raw counts rather than the percentage within each Category. I started with:
library(ggplot2)
ggplot(df, aes(Category)) +
  geom_bar()
However I can't seem to understand how to change the aesthetics to percent.
I think you should use the proportions you have calculated in df1 and plot using geom_col, which is equivalent to geom_bar(stat = "identity"), meaning the bar heights will be given by the y values you pass rather than the counts.
By default, the percentages within each category will be stacked on top of each other, so you can pick them out using a fill aesthetic.
It's often helpful to write in the actual percentages too using geom_text, but this is not everyone's cup of tea, so you can remove the geom_text call if you like and the rest of the plot will stay the same.
Finally, since we want to show percentages rather than proportions, we use scales::percent as a shorthand way of labelling the y axis.
ggplot(df1, aes(Category, prop, fill = Response)) +
  geom_col() +
  geom_text(aes(label = scales::percent(prop)),
            position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = scales::percent)
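With the default settings the labels may carry more decimal places than you want, since some proportions here (such as 0.429) are not round numbers. The following is a minimal sketch, not the only way to do it, that rebuilds the question's sample data so it runs on its own and assumes the accuracy argument of the scales package is used to round both the bar labels and the axis labels to whole percentages. The name p is only illustrative.

```r
library(dplyr)
library(ggplot2)

# Rebuild the sample data from the question so this sketch is self-contained
df <- data.frame(
  Category = c(rep("Category1", 5), rep("Category2", 8), rep("Category3", 7)),
  Response = c(rep("No", 2), rep("I don't know", 1), rep("Yes", 2),
               rep("No", 2), rep("I don't know", 2), rep("Yes", 2),
               rep("No", 2), rep("I don't know", 3), rep("Yes", 2),
               rep("No", 2))
)

# Proportion of each Response within its Category
df1 <- df %>%
  count(Category, Response) %>%
  group_by(Category) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# accuracy = 1 rounds to the nearest whole percent (0.429 becomes "43%")
p <- ggplot(df1, aes(Category, prop, fill = Response)) +
  geom_col() +
  geom_text(aes(label = scales::percent(prop, accuracy = 1)),
            position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = scales::label_percent(accuracy = 1))

# Run print(p) to draw the plot
```

Rounding can make the labels within a bar add up to 99% or 101%, so keep one decimal place (accuracy = 0.1) if that matters for your audience.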
You can change the styling, including colors, background and positions:
ggplot(df1, aes(Category, prop, fill = Response)) +
  geom_col(width = 0.8, position = position_dodge(width = 0.8),
           color = "forestgreen") +
  geom_text(aes(y = prop/2, label = scales::percent(prop)),
            position = position_dodge(width = 0.8)) +
  scale_y_continuous(labels = scales::percent) +
  theme_classic() +
  scale_fill_brewer(palette = "Greens") +
  theme(panel.grid.minor = element_line(),
        panel.grid.major = element_line())
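The question also asked for a labelled bar chart of the overall percentages, without grouping by Category. The same geom_col approach should carry over to the total table from the question. This is a hedged sketch that rebuilds the data so it is self-contained; the name p_total, the label placement above the bars and the extra headroom on the y axis are illustrative choices rather than anything the original code prescribes.

```r
library(dplyr)
library(ggplot2)

# Rebuild the sample data from the question so this sketch is self-contained
df <- data.frame(
  Category = c(rep("Category1", 5), rep("Category2", 8), rep("Category3", 7)),
  Response = c(rep("No", 2), rep("I don't know", 1), rep("Yes", 2),
               rep("No", 2), rep("I don't know", 2), rep("Yes", 2),
               rep("No", 2), rep("I don't know", 3), rep("Yes", 2),
               rep("No", 2))
)

# Overall proportion of each Response, ignoring Category
total <- df %>%
  count(Response) %>%
  mutate(prop = n / sum(n))

p_total <- ggplot(total, aes(Response, prop)) +
  geom_col() +
  # vjust = -0.5 places each label just above its bar
  geom_text(aes(label = scales::percent(prop, accuracy = 1)), vjust = -0.5) +
  # extra space at the top so the labels are not clipped
  scale_y_continuous(labels = scales::label_percent(accuracy = 1),
                     expand = expansion(mult = c(0, 0.1)))

# Run print(p_total) to draw the plot
```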