
R ggplot2: add mean and standard deviation in same plot for multiple variables

I have 4 vectors of 16 values each. Each value is the mean of an item, and I have the same 16 items across 4 datasets.

I use ggplot2 to plot these means. Here is a reproducible example:

library("ggplot2")
library("dplyr")    

means <- as.data.frame(cbind(rnorm(16),rnorm(16), rnorm(16), rnorm(16)))
means <- mutate(means, id = rownames(means))
colnames(means)<-c("1", "2", "3", "4", "Symptoms")
means_long <- reshape2::melt(means, id="Symptoms")  # melt() comes from reshape2, which is not attached above
means_long$Symptoms <- as.numeric(means_long$Symptoms)
names(means_long)[2] <- "Datasets"

ggplot(data=means_long, aes(x=Symptoms, y=value, colour=Datasets)) +
      geom_line() +
      geom_point(shape = 21, fill = "white", size = 1.5, stroke = 1) +
      xlab("Symptoms") + ylab("Means") +
      scale_y_continuous() + 
      scale_x_continuous(breaks=c(1:16)) +
      theme_bw() +
      theme(panel.grid.minor=element_blank()) +
      coord_flip()

Now, I have 4 further vectors, which are the standard deviations of the 16 items for the 4 datasets. I want to plot them in the same plot. The data are in the same format as above, so it's virtually the same code.

I want the standard deviations in the same plot as the means, using the same colors but different line types (so the dataset 1 mean is a solid red line and the dataset 1 standard deviation is a dashed red line). In the best case, the legend would distinguish the datasets (as it does currently) and also "mean" vs. "standard deviation" for the solid and dashed lines.

Thank you for your help!

Does this help?

To make it not look super ugly, I made all the random mean values positive, and then just made the example standard deviations negative. The way to plot the values on the same graph is to feed the datasets separately into each geom, rather than defining the data in the initial ggplot() call.

Let me know if this isn't what you were thinking:

library("ggplot2")
library("dplyr")    

means <- as.data.frame(abs(cbind(rnorm(16),rnorm(16), rnorm(16), rnorm(16))))
means <- mutate(means, id = rownames(means))
colnames(means)<-c("1", "2", "3", "4", "Symptoms")
means_long <- reshape2::melt(means, id="Symptoms")
means_long$Symptoms <- as.numeric(means_long$Symptoms)
names(means_long)[2] <- "Datasets"


sds_long <- means_long
sds_long$value <- -sds_long$value

ggplot() +
  geom_line(aes(x=Symptoms, y=value, colour=Datasets), lty=1, data=means_long) +
  geom_point(aes(x=Symptoms, y=value, colour=Datasets), data=means_long, shape = 21, fill = "white", size = 1.5, stroke = 1) +
  geom_line(aes(x=Symptoms, y=value, colour=Datasets), lty=2, data=sds_long) +
  geom_point(  aes(x=Symptoms, y=value, colour=Datasets), data=sds_long, shape = 21, fill = "white", size = 1.5, stroke = 1) +
  xlab("Symptoms") + ylab("Means") +
  scale_y_continuous() + 
  scale_x_continuous(breaks=c(1:16)) +
  theme_bw() +
  theme(panel.grid.minor=element_blank()) +
  coord_flip()


To answer your legend query: in short, I think this is very hard because the same aesthetic mapping is being used with both datasets.

However, using the code from this answer, I did the following. The idea is to take the legends from two plots, one plotting only the means and one plotting only the SDs, and then add those legends to a version of the plot with no legend. It could be adapted to position the legends more manually.

### Step 1
# Draw a plot with the colour legend for the means
p1 <- ggplot() +
  geom_line(aes(x=Symptoms, y=value, colour=Datasets), lty=1, data=means_long) +
  geom_point(aes(x=Symptoms, y=value, colour=Datasets), data=means_long, shape = 21, fill = "white", size = 1.5, stroke = 1) +
  scale_color_manual(name = "Means", values=c("red","blue", "green","pink")) +
  coord_flip() +
  theme_bw() +
  theme(panel.grid.minor=element_blank()) +
  theme(legend.position = "top")

# Extract the means legend - leg1
library(gtable)
leg1 <- gtable_filter(ggplot_gtable(ggplot_build(p1)), "guide-box")

### Step 2
# Draw a plot with the colour legend for the SDs
p2 <- ggplot() +
  geom_line(aes(x=Symptoms, y=value, color=Datasets), lty=2, data=sds_long) +
  geom_point(aes(x=Symptoms, y=value, color=Datasets), data=sds_long, shape = 21, fill = "white", size = 1.5, stroke = 1) +
  coord_flip() +
  theme_bw() +
  theme(panel.grid.minor=element_blank()) +
  scale_color_manual(name = "SDs", values=c("red","blue", "green","pink"))

# Extract the SDs legend - leg2
leg2 <- gtable_filter(ggplot_gtable(ggplot_build(p2)), "guide-box")

### Step 3
# Draw a plot with no legends - p3
p3 <- ggplot() +
  geom_line(aes(x=Symptoms, y=value, colour=Datasets), lty=1, data=means_long) +
  geom_point(aes(x=Symptoms, y=value, colour=Datasets), data=means_long, shape = 21, fill = "white", size = 1.5, stroke = 1) +
  geom_line(aes(x=Symptoms, y=value, color=Datasets), lty=2, data=sds_long) +
  geom_point(aes(x=Symptoms, y=value, color=Datasets), data=sds_long, shape = 21, fill = "white", size = 1.5, stroke = 1) +
  xlab("Symptoms") + ylab("Means") +
  scale_y_continuous() +
  scale_x_continuous(breaks=c(1:16)) +
  theme_bw() +
  theme(panel.grid.minor=element_blank()) +
  coord_flip() +
  scale_color_manual(values=c("red","blue", "green","pink")) +
  theme(legend.position = "none")

### Step 4
# Arrange the three components (plot, leg1, leg2)
# The two legends are positioned outside the plot:
# one at the top and the other to the side.
library(grid)
library(gridExtra)  # provides arrangeGrob()
plotNew <- arrangeGrob(leg1, p3,
  heights = unit.c(leg1$height, unit(1, "npc") - leg1$height), ncol = 1)
plotNew <- arrangeGrob(plotNew, leg2,
  widths = unit.c(unit(1, "npc") - leg2$width, leg2$width), nrow = 1)
grid.newpage()
grid.draw(plotNew)
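
A possibly simpler route to a second legend, offered as a sketch rather than a tested replacement for the legend-extraction steps: map linetype to a constant string inside aes() in each geom_line() layer, so that ggplot2 builds a linetype legend itself while the two data frames stay separate. The data here is synthetic, and the legend title "Statistic" and the labels are illustrative assumptions, not part of the original question.

```r
library(ggplot2)

set.seed(1)  # synthetic data, for illustration only
means_long <- data.frame(
  Symptoms = rep(1:16, times = 4),
  Datasets = factor(rep(1:4, each = 16)),
  value    = abs(rnorm(64))
)
sds_long <- transform(means_long, value = -value)  # stand-in for real SDs

p <- ggplot(mapping = aes(x = Symptoms, y = value, colour = Datasets)) +
  # A constant inside aes() acts like a one-level variable, so each
  # layer contributes one entry to the linetype legend.
  geom_line(aes(linetype = "Mean"), data = means_long) +
  geom_line(aes(linetype = "Standard deviation"), data = sds_long) +
  geom_point(data = means_long, shape = 21, fill = "white") +
  geom_point(data = sds_long, shape = 21, fill = "white") +
  scale_linetype_manual(
    name   = "Statistic",  # hypothetical legend title
    values = c("Mean" = "solid", "Standard deviation" = "dashed")
  ) +
  scale_x_continuous(breaks = 1:16) +
  theme_bw() +
  coord_flip()

print(p)
```

With this approach the colour legend still identifies the dataset, and the linetype legend separates means from standard deviations, without assembling grobs by hand.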


I suggest an implementation that needs only a single data frame to plot. You don't need to tweak your code much, and you can still distinguish the datasets (i.e., 1, 2, 3, 4) and the types of values (e.g., mean, sd).

library("ggplot2")
library("dplyr")    

# Means
means <- as.data.frame(cbind(rnorm(16),rnorm(16), rnorm(16), rnorm(16)))
means <- mutate(means, id = rownames(means))
colnames(means)<-c("1", "2", "3", "4", "Symptoms")
means_long <- reshape2::melt(means, id="Symptoms")  # melt() comes from reshape2, which is not attached above
means_long$Symptoms <- as.numeric(means_long$Symptoms)
names(means_long)[2] <- "Datasets"

# Sd
sds_long <- means_long
sds_long$value <- -sds_long$value

################################################################################
# Add "Type" column to distinguish means and sds
################################################################################
type <- c("Mean")
means_long <- cbind(means_long, type)

type <- c("Sd")
sds_long <- cbind(sds_long, type)

merged <- rbind(means_long, sds_long)

colnames(merged)[4] <- "Type"

################################################################################
# Plot
################################################################################
ggplot(data = merged) +
  geom_line(aes(x = Symptoms, y = value, col = Datasets, linetype = Type)) +
  geom_point(aes(x = Symptoms, y = value, col = Datasets), 
             shape = 21, fill = "white", size = 1.5, stroke = 1) +
  xlab("Symptoms") + ylab("Means") +
  scale_y_continuous() + 
  scale_x_continuous(breaks=c(1:16)) +
  theme_bw() +
  theme(panel.grid.minor=element_blank()) +
  coord_flip()
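
The merge step can also be written more compactly. As a sketch under the assumption that dplyr is available (it is already loaded above), bind_rows() with its .id argument stacks the two data frames and creates the "Type" column in one call. The data below is synthetic and only mirrors the shape of means_long and sds_long.

```r
library(dplyr)

set.seed(1)  # synthetic data, for illustration only
means_long <- data.frame(
  Symptoms = rep(1:16, times = 4),
  Datasets = factor(rep(1:4, each = 16)),
  value    = rnorm(64)
)
sds_long <- mutate(means_long, value = -value)  # stand-in for real SDs

# The list names become the values of the new "Type" column.
merged <- bind_rows(list(Mean = means_long, Sd = sds_long), .id = "Type")

table(merged$Type)
```

The resulting merged data frame can be passed to the same ggplot() call, since it has the Symptoms, Datasets, value and Type columns that the plot maps.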

