
Add significance bars within and between groups in ggplot2 boxplots

If I have the following dataframe, organised by type and subtype:

library(ggplot2)

# rnorm(n, mean): 100 draws per group, sd left at its default of 1
df = rbind(
  data.frame(type="A", subtype="c", value=rnorm(100, mean=1)),
  data.frame(type="B", subtype="c", value=rnorm(100, mean=1.5)),
  data.frame(type="A", subtype="d", value=rnorm(100, mean=2)),
  data.frame(type="B", subtype="d", value=rnorm(100, mean=2.5))
)

And I create a box plot as per below:

p = ggplot(df, aes(x=type, y=value, color=subtype)) +
  geom_boxplot(outlier.shape = NA)

(image: the resulting box plot)

I know that I can add significance bars between the subtypes of each type (i.e. between A.c and A.d, and also between B.c and B.d) by doing the following:

p + ggpubr::stat_compare_means(method = 'wilcox.test', label = 'p.signif', show.legend = FALSE)

How can I add significance bars between selected columns? For example, if I want to add a significance bar between A.c and B.c, how can I do this? Is there some way to refer to the columns?

The comparisons parameter of the ggpubr::stat_compare_means function apparently takes in column indexes, but I can't figure out how to reference the subtype indexes.

I want something that looks like the following:

(image: the desired plot, with significance bars between selected boxes)

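Here is a potential solution based on the ggsignif package. It combines type and subtype into a single factor, so that every box gets its own x-axis level that comparisons can refer to by name: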

library(tidyverse)
library(ggsignif)

df = rbind(
  data.frame(type="A", subtype="c", value=rnorm(100, mean=1)),
  data.frame(type="B", subtype="c", value=rnorm(100, mean=1.5)),
  data.frame(type="A", subtype="d", value=rnorm(100, mean=2)),
  data.frame(type="B", subtype="d", value=rnorm(100, mean=2.5))
)

df$interaction <- factor(interaction(df$type, df$subtype),
                         levels = c("A.c", "A.d", "B.c", "B.d"))

ggplot(df, aes(x=interaction, y=value)) +
  geom_boxplot(aes(colour = subtype), outlier.shape = NA) +
  geom_signif(comparisons = list(c("A.c", "B.c"),
                                 c("A.c", "A.d"),
                                 c("B.c", "B.d"),
                                 c("A.d", "B.d")),
              test = "wilcox.test", step_increase = 0.075,
              map_signif_level = TRUE, tip_length = 0)

example_1.png
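To check what the bars are reporting, the same pairwise tests can be run directly in base R. This is only a sketch: the `group` column, the `pairs` list and the helper `p_to_stars()` are names made up for illustration, and the star thresholds (0.001, 0.01, 0.05, otherwise "NS.") are what I understand `map_signif_level = TRUE` to use by default, so check them against your ggsignif version.

```r
set.seed(42)  # reproducible example data

df <- rbind(
  data.frame(type = "A", subtype = "c", value = rnorm(100, mean = 1)),
  data.frame(type = "B", subtype = "c", value = rnorm(100, mean = 1.5)),
  data.frame(type = "A", subtype = "d", value = rnorm(100, mean = 2)),
  data.frame(type = "B", subtype = "d", value = rnorm(100, mean = 2.5))
)

# One label per box, e.g. "A.c" (same naming as the x-axis levels above)
df$group <- interaction(df$type, df$subtype)

# The same four comparisons passed to geom_signif()
pairs <- list(c("A.c", "B.c"), c("A.c", "A.d"),
              c("B.c", "B.d"), c("A.d", "B.d"))

# Unpaired Wilcoxon rank-sum test for each pair
p_values <- vapply(pairs, function(pr) {
  wilcox.test(df$value[df$group == pr[1]],
              df$value[df$group == pr[2]])$p.value
}, numeric(1))
names(p_values) <- vapply(pairs, paste, character(1), collapse = " vs ")

# Hypothetical helper: map a p-value to a star label
# (thresholds assumed to match ggsignif's defaults)
p_to_stars <- function(p) {
  if (p < 0.001) "***" else if (p < 0.01) "**" else if (p < 0.05) "*" else "NS."
}

print(data.frame(p_value = p_values,
                 label = vapply(p_values, p_to_stars, character(1))))
```

The printed table has one row per comparison, which makes it easy to see whether a bar's label is what you expect before tweaking the plot.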

Edit

You can also tweak the labels to look 'cleaner' and add other geom layers, e.g.:

library(tidyverse)
library(ggsignif)
library(ggbeeswarm)

df = rbind(
  data.frame(type="A", subtype="c", value=rnorm(100, mean=1)),
  data.frame(type="B", subtype="c", value=rnorm(100, mean=1.5)),
  data.frame(type="A", subtype="d", value=rnorm(100, mean=2)),
  data.frame(type="B", subtype="d", value=rnorm(100, mean=2.5))
)

df$Category <- factor(interaction(df$type, df$subtype),
                      levels = c("A.c", "A.d", "B.c", "B.d"),
                      labels = c("Type A\nSubtype c", "Type A\nSubtype d",
                                 "Type B\nSubtype c", "Type B\nSubtype d"))

ggplot(df, aes(x=Category, y=value)) +
  geom_boxplot(aes(colour = subtype),
               outlier.shape = NA) +
  geom_signif(comparisons = list(c("Type A\nSubtype c", "Type B\nSubtype c"),
                                 c("Type A\nSubtype c", "Type A\nSubtype d"),
                                 c("Type B\nSubtype c", "Type B\nSubtype d"),
                                 c("Type A\nSubtype d", "Type B\nSubtype d")),
              test = "wilcox.test", step_increase = 0.075,
              map_signif_level = TRUE, tip_length = 0) +
  # groupOnX is omitted: recent ggbeeswarm releases deprecate it and infer the grouping axis
  geom_quasirandom(aes(fill = type), shape = 21,
                   size = 2, alpha = 0.5) +
  scale_fill_viridis_d(option = "D", begin = 0, end = 0.3)

example_2.png
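If you would rather keep the original layout (type on the x axis, subtypes dodged side by side), geom_signif() also accepts manual bar positions through xmin, xmax, y_position and annotations. The following is a rough sketch, not a tested recipe: the offset of 0.75 / 4 assumes geom_boxplot()'s default width of 0.75 with exactly two subtypes per type, and `offset`, `top`, `labels` and `p_between_types()` are names invented here. The p-values are computed separately with wilcox.test() because manual annotations are not tested by ggsignif itself.

```r
library(ggplot2)
library(ggsignif)

set.seed(1)  # reproducible example data

df <- rbind(
  data.frame(type = "A", subtype = "c", value = rnorm(100, mean = 1)),
  data.frame(type = "B", subtype = "c", value = rnorm(100, mean = 1.5)),
  data.frame(type = "A", subtype = "d", value = rnorm(100, mean = 2)),
  data.frame(type = "B", subtype = "d", value = rnorm(100, mean = 2.5))
)

# Assumption: boxes are dodged within the default width of 0.75, so with two
# subtypes per type the box centres sit about 0.75 / 4 either side of x.
offset <- 0.75 / 4

# Hypothetical helper: Wilcoxon p-value for one subtype, type A vs type B
p_between_types <- function(st) {
  wilcox.test(value ~ type, data = df[df$subtype == st, ])$p.value
}

# Distinct text for each bar (subtype name plus rounded p-value)
labels <- c(
  paste("c: p =", signif(p_between_types("c"), 2)),
  paste("d: p =", signif(p_between_types("d"), 2))
)

top <- max(df$value)  # place the bars above the highest data point

# colour is mapped inside geom_boxplot() so that geom_signif() does not inherit it
p <- ggplot(df, aes(x = type, y = value)) +
  geom_boxplot(aes(colour = subtype), outlier.shape = NA) +
  geom_signif(
    xmin = c(1 - offset, 1 + offset),    # left ends: A.c and A.d
    xmax = c(2 - offset, 2 + offset),    # right ends: B.c and B.d
    y_position = c(top + 0.3, top + 1),  # stagger the bars so they do not overlap
    annotations = labels,
    tip_length = 0
  )

print(p)
```

The trade-off is that the x positions are hard-coded, so they need adjusting by hand if the number of subtypes or the dodge width changes; the interaction-factor approach above avoids that.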
