
R ggplot geom_bar, include labels even if data is missing

I am trying to automate, or at least semi-automate, the monthly graphs that show the number of specific (flagged) organisms found in hospital wards. I am using a stacked bar chart for this. The problem is that if an organism is absent from my dataset, it does not appear in the legend. My workaround is to add a row for each organism that was not found in that month and set its ward to an empty string (""). That gives me the legend I want, but, as you can see at the bottom of the chart, the "" ward is now shown as an empty tick, which looks unprofessional. I also tried geom_blank and setting the labels manually without including the missing organisms, but that did not work. Is there a way to force the legend to be exactly what I want, irrespective of whether the data are included?

library(ggplot2)

ward_stats <- read.csv("ward_stats.csv")

#specific colors for specific organisms
my_colours <- c("Acinetobacter baumannii (Carbapenem resisistant)" = "red3",
                "Pseudomonas aeruginosa (MDR)" = "gold",
                "Enterobacter cloacae (ESBL)" = "purple",
                "Enterococcus faecium (VRE)" = "violet",
                "Escherichia coli (ESBL)" = "dodgerblue1",
                "Klebsiella pneumoniae (ESBL)" = "yellowgreen",
                "Mycobacterium tuberculosis complex" = "black",
                "Staphylococcus aureus (MRSA)" = "turquoise",
                "Klebsiella pneumoniae (Carbapenem resistant)" = "grey",
                "Clostridium difficile" = "sienna4")

#A vector of organisms on the flag list in the order we want to show in the legend
targetOrder <- c("Acinetobacter baumannii (Carbapenem resisistant)", "Pseudomonas aeruginosa (MDR)", 
                 "Enterobacter cloacae (ESBL)", "Escherichia coli (ESBL)", "Klebsiella pneumoniae (ESBL)", 
                 "Klebsiella pneumoniae (Carbapenem resistant)", "Staphylococcus aureus (MRSA)", "Enterococcus faecium (VRE)",
                 "Clostridium difficile", "Mycobacterium tuberculosis complex")


# Refer to columns by bare name inside aes(); ggplot looks them up in `data`
p <- ggplot(data=ward_stats, aes(x=Ward.Name,
                                 y=freq,
                                 fill=Result...Organism.Identified))
p <- p + geom_bar(stat="identity")
p <- p + geom_text(aes(y=cum_freq, label=freq), hjust= 2, color='white')
p <- p + coord_flip()
p <- p + ggtitle("Hospital")
p <- p + theme(plot.title = element_text(size=20, face="bold"))
p <- p + labs(x=NULL, y= "Number cultured per ward")
p <- p + theme(axis.title.x = element_text(color="black", vjust=-2, size=12))
p <- p + theme(axis.text.x=element_text(size=10, vjust=0.5))
p <- p + theme(legend.title=element_blank())
p <- p + scale_fill_manual(values = my_colours, breaks = targetOrder)
p <- p + theme(panel.background = element_rect(fill="#e6e6ff"))

#pdf_title <- paste(graph_title,".pdf", sep="")
#ggsave("graph.pdf", plot=p, width = 10, height = 8, units = "in")

print(p)

my_graph

Look at the bottom of the graph linked above: there is an empty position at the first tick.

The Data

What I have done is quite ad hoc but it should work.

# Refer to columns by bare name inside aes(); ggplot looks them up in `data`
p <- ggplot(data=ward_stats, aes(x=Ward.Name,
                                 y=freq,
                                 fill=Result...Organism.Identified))
p <- p + geom_bar(stat="identity")
p <- p + geom_text(aes(y=cum_freq, label=freq), hjust= 2, color='white')
p <- p + coord_flip(xlim = c(2, 23))
p <- p + ggtitle("Hospital")
p <- p + theme(plot.title = element_text(size=20, face="bold"))
p <- p + labs(x=NULL, y= "Number cultured per ward")
p <- p + theme(axis.title.x = element_text(color="black", vjust=-2, size=12))
p <- p + theme(axis.text.x=element_text(size=10, vjust=0.5))
p <- p + theme(legend.title=element_blank())
p <- p + scale_fill_manual(values = my_colours, breaks = targetOrder)
p <- p + theme(panel.background = element_rect(fill="#e6e6ff"))
p <- p + theme(axis.ticks.y = element_line(colour = c('white', rep('black', 22))))

p

What I have done is pass the xlim argument to coord_flip() so that everything but the first element (the "" ward) is selected. The limits apply to the x axis as it is defined before the flip, i.e. the ward axis.

p <- p + coord_flip(xlim = c(2, 23))
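The hard-coded 23 presumably matches the number of wards in that month's data. A minimal sketch of how the upper limit could be derived from the data instead, using a hypothetical toy data frame (`toy`, `ward`, `organism`, `freq` and `n_wards` are illustrative names, not from the original data):

```r
library(ggplot2)

# Hypothetical toy data: the "" ward is the dummy row added only to keep a legend entry
toy <- data.frame(
  ward     = c("", "ICU", "Ward A", "Ward B"),
  organism = c("Clostridium difficile", "Escherichia coli (ESBL)",
               "Escherichia coli (ESBL)", "Staphylococcus aureus (MRSA)"),
  freq     = c(0, 3, 2, 4)
)

# A character x variable is placed at positions 1..n in sorted order;
# "" sorts first, so the dummy ward sits at position 1
n_wards <- length(unique(toy$ward))

p_toy <- ggplot(toy, aes(x = ward, y = freq, fill = organism)) +
  geom_col() +
  coord_flip(xlim = c(2, n_wards))  # keep positions 2..n, hiding the "" ward

print(p_toy)
```

This only works while the dummy ward is the first position on the axis; if Ward.Name is a factor with a different level order, the limits would need adjusting.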

With xlim set, the tick for the empty ward was still visible outside the plot area. I fixed this by setting the tick colours individually.

p <- p + theme(axis.ticks.y = element_line(colour = c('white', rep('black', 22))))
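For comparison, a commonly used alternative avoids the dummy "" ward altogether: declare the full flag list as factor levels and ask the fill scale not to drop unused levels. The sketch below uses hypothetical toy data and a shortened flag list (`flagged`, `cols`, `toy` are illustrative stand-ins for targetOrder, my_colours and ward_stats), so treat it as an outline rather than a drop-in replacement:

```r
library(ggplot2)

# Hypothetical, shortened stand-ins for targetOrder and my_colours
flagged <- c("Escherichia coli (ESBL)",
             "Staphylococcus aureus (MRSA)",
             "Clostridium difficile")
cols <- c("Escherichia coli (ESBL)"      = "dodgerblue1",
          "Staphylococcus aureus (MRSA)" = "turquoise",
          "Clostridium difficile"        = "sienna4")

# Toy month in which Clostridium difficile was not cultured at all
toy <- data.frame(
  ward     = c("ICU", "Ward A", "Ward B"),
  organism = c("Escherichia coli (ESBL)", "Escherichia coli (ESBL)",
               "Staphylococcus aureus (MRSA)"),
  freq     = c(3, 2, 4)
)

# Declare every flagged organism as a factor level, whether present or not
toy$organism <- factor(toy$organism, levels = flagged)

p_alt <- ggplot(toy, aes(x = ward, y = freq, fill = organism)) +
  # show.legend = TRUE: recent ggplot2 releases (3.5.0 and later, as far as I know)
  # otherwise omit the key glyph for levels that have no data in the layer
  geom_col(show.legend = TRUE) +
  # drop = FALSE keeps unused factor levels in the scale, and so in the legend
  scale_fill_manual(values = cols, drop = FALSE) +
  coord_flip()

print(p_alt)
```

Because no dummy rows are added, there is no empty ward on the axis, so neither the xlim restriction nor the per-tick colours should be needed with this approach.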


 