Adding summary statistics labels to box plot using ggplot in R

I am trying to add labels that sit above box plots. In the example below, instead of NA, I would like the label above A to say "total number of var3 = 11" and the label above B to say "total number of var3 = 34". In my real data, numbers are produced, but they bear no relation to the original data set (I cannot work out how they could be calculated from the original data, so I must be doing something wrong).

library(dplyr)
library(ggplot2)

var1 <- c("A", "B", "A", "B", "B", "B", "A", "B", "B")
var2 <- as.numeric(4:12)
var3 <- as.numeric(1:9)

df <- data.frame(var1, var2, var3)

stat_box_data <- function(y, upper_limit = max(df$var2) * 1.15 ) {
  return( 
    data.frame(
      y = 0.95* upper_limit,
      label = paste('number of var1 =', length(y), '\n', 
                    'total number of var3 =', sum(df$var3[y])
      )
    )
  )
}

ggplot(df, aes(var1, var2)) + 
  geom_boxplot() +
  stat_summary(    fun.data = stat_box_data, 
                   geom = "text", 
                   hjust = 0.5,
                   vjust = 0.9)

# Expected totals of var3 per group: A = 11, B = 34
df %>% group_by(var1) %>% summarise(sum = sum(var3))

Link to graph

Thanks to the original post for the code: https://gscheithauer.medium.com/how-to-add-number-of-observations-to-a-ggplot2-boxplot-b22710f7ef80
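A likely source of the NA, sketched in base R under the assumption that stat_summary() hands the function only the y values (var2 here) of one group at a time: `df$var3[y]` then treats those var2 values as row positions rather than as a selection of the group's rows. The names `y_A`, `wrong_A` and `expected` are illustrative only. With real data whose y values happen to be valid row positions, the same indexing would return unrelated numbers instead of NA.

```r
# Same toy data as in the question
var1 <- c("A", "B", "A", "B", "B", "B", "A", "B", "B")
var2 <- as.numeric(4:12)
var3 <- as.numeric(1:9)
df <- data.frame(var1, var2, var3)

# Assumed input to the summary function for group A: its var2 values
y_A <- df$var2[df$var1 == "A"]   # 4, 6, 10

# Using those values as row positions picks rows 4, 6 and 10;
# row 10 does not exist in a 9-row column, so an NA appears
wrong_A <- df$var3[y_A]
sum(wrong_A)

# The intended per-group totals, for comparison
expected <- tapply(df$var3, df$var1, sum)
expected
```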

One way is to enter the values manually, for example with an ifelse statement:

  stat_box_data <- function(y, upper_limit = max(df$var2) * 1.15, y2 = df[c(1,3)]) {
    return( 
      data.frame(
        y = 0.95* upper_limit,
        label = paste('number of var1 =', length(y), '\n', 
                      'total number of var3 =', ifelse(length(y)>3, 34, 11) , '\n'  
        )
      )
    )
  }
  
  ggplot(df, aes(var1, var2)) + 
    geom_boxplot() +
    stat_summary(    fun.data = stat_box_data, 
                     geom = "text", 
                     hjust = 0.5,
                     vjust = 0.9)

Example

You can automate this a little with the following:

  group1 <- df%>%
    filter(var1 == "A")
  group2 <- df %>%
    filter(var1 == "B")
  
  stat_box_data <- function(y, upper_limit = max(df$var2) * 1.15, y2 = df[c(1,3)]) {
    return( 
      data.frame(
        y = 0.95* upper_limit,
        label = paste('number of var1 =', length(y), '\n', 
                      'total number of var3 =', ifelse(length(y)>3, sum(group2$var3), sum(group1$var3)) , '\n'  
        )
      )
    )
  }
  
  ggplot(df, aes(var1, var2)) + 
    geom_boxplot() +
    stat_summary(    fun.data = stat_box_data, 
                     geom = "text", 
                     hjust = 0.5,
                     vjust = 0.9)

You could get the result you want using this rather convoluted method.

library(dplyr)
library(ggplot2)
var1<- c("A", "B", "A", "B", "B", "B", "A", "B", "B")
var2<- as.numeric(c(4:12))
var3<- as.numeric(c(1:9))

df<- data.frame(var1, var2, var3)

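stat_box_data <- function(y, upper_limit = max(df$var2) * 1.15) {
  # y holds the var2 values of one group; match() maps them back to row
  # positions in df (this relies on the var2 values being unique)
  return(
    data.frame(
      y = 0.95 * upper_limit,
      label = paste('count =', length(y), '\n',
                    'total number of var3 =', sum(df$var3[match(y, df$var2)]), '\n'
      )
    )
  )
}

# Per-group sums of var3, for checking the labels (A = 11, B = 34)
d <- df %>% group_by(var1) %>% summarise(sum = sum(var3)) %>% pull(sum)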

ggplot(df, aes(var1, var2)) + 
  geom_boxplot() +
  stat_summary(fun.data = stat_box_data,
                   geom = "text", 
                   hjust = 0.5,
                   vjust = 0.9)
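An alternative that avoids looking rows up from inside the summary function is to compute the labels first and draw them with geom_text(). This is only a sketch, not the method from the answers above: the names `label_df`, `label_y`, `n` and `total` are made up for illustration, and the label height reuses the 0.95 * 1.15 rule from the earlier code. It does not depend on the var2 values being unique.

```r
library(dplyr)
library(ggplot2)

df <- data.frame(
  var1 = c("A", "B", "A", "B", "B", "B", "A", "B", "B"),
  var2 = as.numeric(4:12),
  var3 = as.numeric(1:9)
)

# Height at which the labels are drawn (same rule as in the answers)
label_y <- 0.95 * max(df$var2) * 1.15

# One row per group, holding the text to print above that box
label_df <- df %>%
  group_by(var1) %>%
  summarise(n = n(), total = sum(var3), .groups = "drop") %>%
  mutate(
    y = label_y,
    label = paste0("number of var1 = ", n, "\n",
                   "total number of var3 = ", total)
  )

p <- ggplot(df, aes(var1, var2)) +
  geom_boxplot() +
  # inherit.aes = FALSE because label_df has no var2 column
  geom_text(data = label_df,
            aes(x = var1, y = y, label = label),
            inherit.aes = FALSE,
            vjust = 0.9)
print(p)
```

Because the totals come from a grouped summary, the same code should carry over to more than two groups without any ifelse() branching.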

