
How to make a base R style boxplot using ggplot2?

I need to make a lot of boxplots for an upcoming publication. I would like to use ggplot2 because I think it will be more flexible for future projects, but my PI insists that I make these plots in the style of base R. He specifically wants the dashed lines, so that the plots look similar to ones we made previously. Here is an example using the iris dataset:

plot(iris$Species,
     iris$Sepal.Length,
     xlab='Species',
     ylab='Sepal Length',
     main='Sepal Variation Across Species',
     col='white')

[Figure: base R plot]

My question is: how can I make a similar-looking plot using ggplot2?

Here is my attempt:

library("ggplot2")
ggplot(iris) +
  geom_boxplot(aes(x=Species,y=Sepal.Length),linetype="dashed") +
  ggtitle("Sepal Variation Across Species")

[Figure: ggplot attempt]

I need the combination of dashed and solid lines, but I cannot get it to work. I have already checked https://stats.stackexchange.com/questions/8137/how-to-add-horizontal-lines-to-ggplot2-boxplot, which comes very close but has no dashed lines, which we need. Also, the outliers are filled circles, which is not the same as base R.

To generate a "base R style" boxplot using ggplot2, we can layer 4 boxplot objects on top of one another. The order matters here, so please keep this in mind if you modify the code. I strongly suggest that you explore this code by plotting each boxplot layer on its own; that way you can get a feel for how the different layers interact.

The ordering of the boxplots works like this (ordered from bottom to top):

  • (1) a fully dashed boxplot is placed first; it supplies the vertical dashed whisker lines
  • (2) a solid box containing a median line, which covers the dashed box from (1)
  • (3) & (4) solid whisker end caps, created with errorbars collapsed to a single horizontal line: ymin is set to ymax for the upper cap, and ymax is set to ymin for the lower cap
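One way to follow the advice above is to build each layer as its own plot and look at them one at a time. This is only a sketch: it assumes ggplot2 3.3.0 or later, where `after_stat()` is available as the replacement for the older `..name..` notation, and the object names `base` and `layer1` to `layer4` are purely illustrative.

```r
library(ggplot2)

# Shared data and aesthetics for every layer
base <- ggplot(iris, aes(x = Species, y = Sepal.Length))

# (1) dashed boxplot with open-circle outliers
layer1 <- base + geom_boxplot(linetype = "dashed", outlier.shape = 1)

# (2) solid box and median line; whiskers are collapsed onto the hinges
layer2 <- base +
  stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)),
               outlier.shape = 1)

# (3) horizontal cap at the upper whisker end (ymin moved up to ymax)
layer3 <- base + stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)))

# (4) horizontal cap at the lower whisker end (ymax moved down to ymin)
layer4 <- base + stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)))

# Print any of them to inspect it, e.g.:
# print(layer3)
```

Printing `layer1` through `layer4` in turn shows what each one contributes before they are stacked.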

I also added custom breaks to match your base R plot, which you can change depending on your needs. panel.border is used to create a thin border in the style of base R. To get the open circles that you want, we set outlier.shape = 1.

The code:

library("ggplot2")

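ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  # (1) dashed boxplot; outlier.shape = 1 draws open circles
  geom_boxplot(linetype = "dashed", outlier.shape = 1) +
  # (2) solid box and median line; whiskers collapsed onto the hinges
  stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..), outlier.shape = 1) +
  # (3) solid cap at the upper whisker end
  stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..)) +
  # (4) solid cap at the lower whisker end
  stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..)) +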
  scale_y_continuous(breaks = seq(4.5, 8.0, 0.5)) +
  labs(title = "Sepal Variation Across Species",
       x = "Species",
       y = "Sepal Length") +
  theme_classic() + # remove panel background and gridlines
  theme(plot.title = element_text(hjust = 0.5,  # hjust = 0.5 centers the title
                                  size = 14,
                                  face = "bold"),
        panel.border = element_rect(linetype = "solid",
                                    colour = "black", fill = NA, size = 0.5))

The plot:

[Figure: resulting ggplot2 boxplot]

Not exactly the same, but it seems to be a decent approximation. Hopefully this is close enough for your needs. Good luck, and happy plotting!
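Newer ggplot2 releases deprecate two things the code above relies on: the `..name..` notation (superseded by `after_stat()`) and the `size` argument of line-type theme elements (superseded by `linewidth`). Assuming ggplot2 3.4.0 or later, the same plot could be written roughly as follows. The variable name `p` is only for illustration.

```r
library(ggplot2)

p <- ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  # dashed boxplot with open-circle outliers
  geom_boxplot(linetype = "dashed", outlier.shape = 1) +
  # solid box drawn over the dashed one
  stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)),
               outlier.shape = 1) +
  # solid caps at the upper and lower whisker ends
  stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax))) +
  stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin))) +
  scale_y_continuous(breaks = seq(4.5, 8.0, 0.5)) +
  labs(title = "Sepal Variation Across Species",
       x = "Species",
       y = "Sepal Length") +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
        # linewidth replaces size for line-type theme elements
        panel.border = element_rect(linetype = "solid", colour = "black",
                                    fill = NA, linewidth = 0.5))

p
```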

Here's a wrapper around @Marcus' great solution, for convenient use and more flexibility:

geom_boxplot2 <- function(mapping = NULL, data = NULL, stat = "boxplot", position = "dodge2", 
                          ..., outlier.colour = NULL, outlier.color = NULL, outlier.fill = NULL, 
                          outlier.shape = 1, outlier.size = 1.5, outlier.stroke = 0.5, 
                          outlier.alpha = NULL, notch = FALSE, notchwidth = 0.5, varwidth = FALSE, 
                          na.rm = FALSE, show.legend = NA, inherit.aes = TRUE,
                          linetype = "dashed"){
  list(
    geom_boxplot(mapping = mapping, data = data, stat = stat, position = position,
                 outlier.colour = outlier.colour, outlier.color = outlier.color, 
                 outlier.fill = outlier.fill, outlier.shape = outlier.shape, 
                 outlier.size = outlier.size, outlier.stroke = outlier.stroke, 
                 outlier.alpha = outlier.alpha, notch = notch, 
                 notchwidth = notchwidth, varwidth = varwidth, na.rm = na.rm, 
                 show.legend = show.legend, inherit.aes = inherit.aes, 
                 linetype = linetype, ...),
    stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..), outlier.shape = 1) ,
    stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..)) ,
    stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..)) ,
    theme_classic(), # remove panel background and gridlines
    theme(plot.title = element_text(hjust = 0.5,  # hjust = 0.5 centers the title
                                    size = 14,
                                    face = "bold"),
          panel.border = element_rect(linetype = "solid",
                                      colour = "black", fill = NA, size = 0.5))
  )
}

ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot2() +
  scale_y_continuous(breaks = seq(4.5, 8.0, 0.5)) + # not sure how to generalize this
  labs(title = "Sepal Variation Across Species", y = "Sepal Length")
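As for the hard-coded breaks, one possible way to generalize them is to pass a function to `breaks`: `scale_y_continuous()` accepts a function that takes the axis limits and returns the break positions. The helper `pretty_breaks_fn` below is a hypothetical name, and aiming for roughly eight ticks is an arbitrary assumption, not a reproduction of base R's own axis algorithm. A plain `geom_boxplot()` stands in for the wrapper so that the sketch is self-contained.

```r
library(ggplot2)

# Hypothetical helper: returns a function that picks "round" break
# positions covering whatever limits the scale passes in
pretty_breaks_fn <- function(n = 8) {
  function(limits) pretty(limits, n = n)
}

p <- ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot(linetype = "dashed", outlier.shape = 1) +
  scale_y_continuous(breaks = pretty_breaks_fn(8))

p
```

The same `scale_y_continuous()` call can be added after `geom_boxplot2()` in place of the fixed `seq()`; adjust `n` to get denser or sparser ticks.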

Building further on what @Marcus & @Moody_Mudskipper have provided:

geom_boxplotMod <- function(mapping = NULL, data = NULL, stat = "boxplot", 
    position = "dodge2", ..., outlier.colour = NULL, outlier.color = NULL, 
    outlier.fill = NULL, outlier.shape = 1, outlier.size = 1.5, 
    outlier.stroke = 0.5, outlier.alpha = NULL, notch = FALSE, notchwidth = 0.5,
    varwidth = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE,
    linetype = "dashed") # signature based on args(geom_boxplot)
    {
    list(geom_boxplot(
            mapping = mapping, data = data, stat = stat, position = position,
            outlier.colour = outlier.colour, outlier.color = outlier.color, 
            outlier.fill = outlier.fill, outlier.shape = outlier.shape, 
            outlier.size = outlier.size, outlier.stroke = outlier.stroke, 
            outlier.alpha = outlier.alpha, notch = notch, 
            notchwidth = notchwidth, varwidth = varwidth, na.rm = na.rm, 
            show.legend = show.legend, inherit.aes = inherit.aes, linetype = 
            linetype, ...),
        stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..), width = 0.25),
        # width = 0.25 makes the error-bar caps narrower
        stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..), width = 0.25),
        stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..),
            outlier.shape = 1),
        theme(panel.background = element_blank(),
            panel.border = element_rect(size = 1.5, fill = NA),
            plot.title = element_text(hjust = 0.5),
            axis.title = element_text(size = 12),
            axis.text = element_text(size = 10.5))
        )
    }

library(ggplot2) # ggplot2 alone is sufficient; loading tidyverse is not required
ggplot(iris, aes(x=Species,y=Sepal.Length, colour = Species)) +
    geom_boxplotMod() +
    ggtitle("Sepal Variation Across Species")

Created on 2020-07-20 by the reprex package (v0.3.0)
