I need to make a lot of boxplots for an upcoming publication. I would like to use ggplot2 because I think it will be more flexible for future projects, but my PI is insisting that I make these plots in the style of base-R. He specifically wants the dashed lines, so that they will appear similar to previous plots we made. I have made an example using the iris dataset to show you, using this code:
plot(iris$Species,
     iris$Sepal.Length,
     xlab='Species',
     ylab='Sepal Length',
     main='Sepal Variation Across Species',
     col='white')
My question is: how can I make a similar-looking plot using ggplot2?
Here is my attempt:
library("ggplot2")
ggplot(iris) +
  geom_boxplot(aes(x=Species,y=Sepal.Length),linetype="dashed") +
  ggtitle("Sepal Variation Across Species")
I need the combination of dashed and solid lines, but I cannot make anything work. I have already checked https://stats.stackexchange.com/questions/8137/how-to-add-horizontal-lines-to-ggplot2-boxplot, which is very close but has no dashed lines, which we need. Also, the outliers are drawn as filled circles, which is not the same as base R.
To generate a "base R style" boxplot using ggplot2, we can layer four boxplot objects on top of one another. The order matters here, so please keep this in mind if you modify the code. I strongly suggest that you explore this code by plotting each boxplot layer on its own; that way you can get a feel for how the different layers interact.
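The ordering of the boxplots works like this (ordered from bottom to top):
1. a dashed boxplot, which supplies the dashed whiskers and the open-circle outliers
2. a solid boxplot whose whiskers are collapsed onto the hinges, so only the box and median are redrawn with solid lines
3. an errorbar collapsed onto the upper whisker end, which draws the top cap
4. an errorbar collapsed onto the lower whisker end, which draws the bottom cap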
I also added custom breaks to match your base R plot, which you can change depending on your needs. panel.border is used to create a thin border in the style of base R. To get the open circles that you want, we use outlier.shape.
The code:
library("ggplot2")
ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot(linetype = "dashed", outlier.shape = 1) +              # dashed whiskers, open-circle outliers
  stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..), outlier.shape = 1) +  # solid box and median on top
  stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..)) +             # cap at the upper whisker end
  stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..)) +             # cap at the lower whisker end
  scale_y_continuous(breaks = seq(4.5, 8.0, 0.5)) +
  labs(title = "Sepal Variation Across Species",
       x = "Species",
       y = "Sepal Length") +
  theme_classic() + # remove panel background and gridlines
  theme(plot.title = element_text(hjust = 0.5, # hjust = 0.5 centers the title
                                  size = 14,
                                  face = "bold"),
        panel.border = element_rect(linetype = "solid",
                                    colour = "black", fill = NA, size = 0.5))
The plot:
Not quite exactly the same, but it seems to be a decent approximation. Hopefully this is close enough for your needs. Good luck, and happy plotting!
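To follow the suggestion of plotting each layer on its own, a minimal sketch might look like the one below. The names base_plot and layers are illustrative and not part of the original answer. The sketch keeps the same ..lower.. notation as the code above, which newer ggplot2 releases flag as deprecated in favour of after_stat().

```r
library(ggplot2)

# Shared data and aesthetics for every single-layer plot
base_plot <- ggplot(iris, aes(x = Species, y = Sepal.Length))

# One plot per layer, in the same bottom-to-top order as the full answer
layers <- list(
  dashed_box = base_plot +
    geom_boxplot(linetype = "dashed", outlier.shape = 1),
  solid_box = base_plot +
    stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..), outlier.shape = 1),
  top_cap = base_plot +
    stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..)),
  bottom_cap = base_plot +
    stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..))
)

# Inspect them one at a time, e.g. print(layers$solid_box)
```

Printing the four plots in turn should make it easier to see which layer contributes the dashed whiskers, the solid box, and each whisker cap.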
Here's a wrapper around @Marcus' great solution, for convenient use and more flexibility:
geom_boxplot2 <- function(mapping = NULL, data = NULL, stat = "boxplot", position = "dodge2",
..., outlier.colour = NULL, outlier.color = NULL, outlier.fill = NULL,
outlier.shape = 1, outlier.size = 1.5, outlier.stroke = 0.5,
outlier.alpha = NULL, notch = FALSE, notchwidth = 0.5, varwidth = FALSE,
na.rm = FALSE, show.legend = NA, inherit.aes = TRUE,
linetype = "dashed"){
list(
geom_boxplot(mapping = mapping, data = data, stat = stat, position = position,
outlier.colour = outlier.colour, outlier.color = outlier.color,
outlier.fill = outlier.fill, outlier.shape = outlier.shape,
outlier.size = outlier.size, outlier.stroke = outlier.stroke,
outlier.alpha = outlier.alpha, notch = notch,
notchwidth = notchwidth, varwidth = varwidth, na.rm = na.rm,
show.legend = show.legend, inherit.aes = inherit.aes,
linetype = linetype, ...),
stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..), outlier.shape = outlier.shape) , # reuse the caller's outlier shape
stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..)) ,
stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..)) ,
theme_classic(), # remove panel background and gridlines
theme(plot.title = element_text(hjust = 0.5, # hjust = 0.5 centers the title
size = 14,
face = "bold"),
panel.border = element_rect(linetype = "solid",
colour = "black", fill = NA, size = 0.5))
)
}
ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
geom_boxplot2() +
scale_y_continuous(breaks = seq(4.5, 8.0, 0.5)) + # not sure how to generalize this
labs(title = "Sepal Variation Across Species", y = "Sepal Length")
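On generalizing the breaks: one possible approach, offered as a sketch rather than a tested replacement, is to pass a function to scale_y_continuous(breaks = ...). ggplot2 calls that function with the scale limits and uses the values it returns. The helper name base_like_breaks and the choice n = 8 are assumptions made for illustration; for the iris data, n = 8 happens to produce 0.5-unit steps. A plain geom_boxplot() stands in for geom_boxplot2() so the snippet runs on its own.

```r
library(ggplot2)

# Hypothetical helper: derive "nice" breaks from whatever limits the scale has
base_like_breaks <- function(limits) pretty(limits, n = 8)

p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot(linetype = "dashed", outlier.shape = 1) +
  scale_y_continuous(breaks = base_like_breaks) +
  labs(title = "Sepal Variation Across Species", y = "Sepal Length")
```

Because the breaks are computed from the limits, the same scale call can be reused for other variables, although n may need adjusting to get the spacing you want.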
Building further on what @Marcus and @Moody_Mudskipper have provided:
geom_boxplotMod <- function(mapping = NULL, data = NULL, stat = "boxplot",
position = "dodge2", ..., outlier.colour = NULL, outlier.color = NULL,
outlier.fill = NULL, outlier.shape = 1, outlier.size = 1.5,
outlier.stroke = 0.5, outlier.alpha = NULL, notch = FALSE, notchwidth = 0.5,
varwidth = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE,
linetype = "dashed") # argument list adapted from args(geom_boxplot)
{
list(geom_boxplot(
mapping = mapping, data = data, stat = stat, position = position,
outlier.colour = outlier.colour, outlier.color = outlier.color,
outlier.fill = outlier.fill, outlier.shape = outlier.shape,
outlier.size = outlier.size, outlier.stroke = outlier.stroke,
outlier.alpha = outlier.alpha, notch = notch,
notchwidth = notchwidth, varwidth = varwidth, na.rm = na.rm,
show.legend = show.legend, inherit.aes = inherit.aes, linetype =
linetype, ...),
stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..), width = 0.25),
# width = 0.25 narrows the caps of the error bars
stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..), width = 0.25),
stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..),
outlier.shape = outlier.shape), # reuse the caller's outlier shape
theme(panel.background = element_blank(),
panel.border = element_rect(size = 1.5, fill = NA),
plot.title = element_text(hjust = 0.5),
axis.title = element_text(size = 12),
axis.text = element_text(size = 10.5))
)
}
library(ggplot2)
ggplot(iris, aes(x=Species,y=Sepal.Length, colour = Species)) +
geom_boxplotMod() +
ggtitle("Sepal Variation Across Species")
Created on 2020-07-20 by the reprex package (v0.3.0)