
ggplot2 bar chart with dplyr: grouped and overall data in R

I'd like to make a stacked proportional bar chart representing the prevalence of diabetes in a cohort of individuals residing in towns A, B, and C. I'd also like the plot to feature a bar representing the entire cohort.

I'm happy with the plot below, but I'd like to know whether the pre-processing step can be folded into the processing step, i.e. piped with dplyr.

Thanks!

Starting point (dfa):

dfa <- data.frame(
  town         = c("A","A","A","B","B","C","C","C","C","C"),
  diabetes     = c("y","y","n","n","y","n","y","n","n","y"),
  heartdisease = c("n","y","y","n","y","y","n","n","n","y")
)

Pre-processing:

dfb <- rbind(dfa, transform(dfa, town = "ALL"))

Processing and plot:

library(dplyr)
library(ggplot2)

dfc <- dfb %>%
  group_by(town) %>%
  count(diabetes) %>%
  mutate(prop = n / sum(n))

ggplot(dfc, aes(x = town, y = prop, fill = diabetes)) +
  geom_bar(stat = "identity") +
  coord_flip()

Like this:

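# Append an "ALL" copy of the data inside the pipeline, then summarise and plot.
# The result is not assigned, so the plot is drawn straight away.
dfa %>%
  bind_rows(dfa %>%
              mutate(town = "ALL")) %>%
  group_by(town) %>%
  count(diabetes) %>%
  mutate(prop = n / sum(n)) %>%
  ggplot(aes(x = town, y = prop, fill = diabetes)) +
    geom_bar(stat = "identity") +
    coord_flip()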

EDIT: added pre-processing into pipeline using bind_rows and mutate instead of rbind and transform
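
If you also want to inspect the proportions before plotting, one option is to keep the summary table and the plot as separate objects. This is only a sketch of the same idea: the names props and p are made up for illustration, stringsAsFactors = FALSE is added so that town is a character column on older R versions, and geom_col() is used as a shorthand for geom_bar(stat = "identity"), assuming a reasonably recent ggplot2.

```r
library(dplyr)
library(ggplot2)

dfa <- data.frame(
  town         = c("A","A","A","B","B","C","C","C","C","C"),
  diabetes     = c("y","y","n","n","y","n","y","n","n","y"),
  heartdisease = c("n","y","y","n","y","y","n","n","n","y"),
  stringsAsFactors = FALSE
)

# Summary table: one row per town/diabetes combination,
# including the extra "ALL" group built inside the pipeline.
props <- dfa %>%
  bind_rows(mutate(dfa, town = "ALL")) %>%
  count(town, diabetes) %>%
  group_by(town) %>%
  mutate(prop = n / sum(n)) %>%   # proportion within each town
  ungroup()

print(props)

# Plot from the summary table; `p` is a hypothetical name.
p <- ggplot(props, aes(x = town, y = prop, fill = diabetes)) +
  geom_col() +
  coord_flip()
print(p)
```

Within each town the prop values should sum to 1, which is a quick way to check that the grouping did what you expected.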
