R and ggplot2: create summary statistics for overlapping ranges

I have a set of records characterized by multiple variables, each marked as either "in" or "out". I want to plot summary statistics for all records together and for those marked "in", while plotting each point only once, colored to show whether it is "in" or "out". How can I do that? I only know how to plot the summary statistics for the "in" and "out" groups (see code below), not for "in" and "all". It would be a plus if the legend explained the colors of the points (as in my illustration) as well as the colors of the error bars.

library(data.table)
library(ggplot2)
d = data.table(v1 = rnorm(10, 0, 1),
              v2 = rnorm(10, 1, 2),
              g = as.factor(c(rep('in', 7), rep('out', 3))))
m = melt(d, c('g'))
print(ggplot(m, aes(x = variable, y = value, colour = g)) +
      facet_wrap(~variable, scales = "free") +
      geom_jitter(position = position_jitter(height = 0, width = 0.2)) +
      stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.25))

[Figure: plot with summary statistics for the "in" and "out" groups]

If you want to show the "in" and "out" points but draw error bars for "in" and for the total, move the colour aesthetic from ggplot() into geom_jitter() and add a separate stat_summary layer for each of "in" and all records:

library(data.table)
library(reshape2) # only needed with older data.table releases, whose melt() relied on reshape2
library(ggplot2)
d <- data.table(v1 = rnorm(10, 0, 1),
               v2 = rnorm(10, 1, 2),
               g = as.factor(c(rep('in', 7), rep('out', 3))))

m <- melt(d, c('g'))

ggplot(m, aes(x = variable, y = value)) + # removed colour here
        facet_wrap(~variable, scales = "free") +
        geom_jitter(aes(colour = g), position = position_jitter(height = 0, width = 0.2)) + #added color here
        stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.25) + #errorbars for total observations
        stat_summary(data=m[m$g == "in",], fun.data = mean_se, geom = "errorbar", width = 0.25, colour = 2) # errorbars for "in" group
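
The code above fixes the error-bar colour outside aes(), so the error bars do not appear in the legend. One possible way to get them there, sketched here under stated assumptions rather than as the answer's own method, is to stack all records (labelled "all") on top of the "in" records, so the two overlapping ranges become ordinary groups that can be mapped to colour. The table `s`, the column `set`, the fixed seed and the specific colours are illustrative choices that do not come from the original post.

```r
library(data.table)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
d <- data.table(v1 = rnorm(10, 0, 1),
                v2 = rnorm(10, 1, 2),
                g  = factor(c(rep("in", 7), rep("out", 3))))
m <- melt(d, id.vars = "g")

# Stack every record (set = "all") on top of the "in" records (set = "in").
# `s` and `set` are hypothetical names used only for this sketch.
s <- rbind(m[, .(variable, value, set = "all")],
           m[g == "in", .(variable, value, set = "in")])

p <- ggplot(m, aes(x = variable, y = value)) +
  facet_wrap(~variable, scales = "free") +
  # each point is drawn once, coloured by its "in"/"out" label
  geom_jitter(aes(colour = g), height = 0, width = 0.2) +
  # error bars are computed from the stacked table, one per value of `set`
  stat_summary(data = s, aes(colour = set), fun.data = mean_se,
               geom = "errorbar", width = 0.25) +
  # one shared colour scale; the colours themselves are arbitrary
  scale_colour_manual(name = NULL,
                      values = c(all = "black", `in` = "red", out = "blue"))
print(p)
```

Because the points and the error bars share one colour scale, the "in" points and the "in" error bar get the same colour. How the legend keys are drawn for each entry depends on the installed ggplot2 version.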

