Grouped barplot in R with error bars

I would like to draw a grouped barplot with error bars. Here is the kind of figure I have been able to produce so far, and it is fine for what I need:

[Image: grouped barplot with error bars]

And here is my script:

#create dataframe
Gene<-c("Gene1","Gene2","Gene1","Gene2")
count1<-c(12,14,16,34)
count2<-c(4,7,9,23)
count3<-c(36,22,54,12)
count4<-c(12,24,35,23)
Species<-c("A","A","B","B")
df<-data.frame(Gene,count1,count2,count3,count4,Species)
df

mean1<-mean(as.numeric(df[1,][c(2,3,4,5)]))
mean2<-mean(as.numeric(df[2,][c(2,3,4,5)]))
mean3<-mean(as.numeric(df[3,][c(2,3,4,5)]))
mean4<-mean(as.numeric(df[4,][c(2,3,4,5)]))
Gene1SpeciesA.stdev<-sd(as.numeric(df[1,][c(2,3,4,5)]))
Gene2SpeciesA.stdev<-sd(as.numeric(df[2,][c(2,3,4,5)]))
Gene1SpeciesB.stdev<-sd(as.numeric(df[3,][c(2,3,4,5)]))
Gene2SpeciesB.stdev<-sd(as.numeric(df[4,][c(2,3,4,5)]))

ToPlot<-c(mean1,mean2,mean3,mean4)

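#plot barplot
#rows = species, columns = genes, so that each gene forms one group of bars
#(transposing this matrix would group the bars by species and mislabel them)
plot<-matrix(ToPlot,2,2,byrow=TRUE)   #2 rows = number of species, 2 columns = number of genes
BarPlot <- barplot(plot, beside=TRUE,ylab="count",
                names.arg=c("Gene1","Gene2"),col=c("blue","red"))

#add legend
legend("topright", 
       legend = c("SpeciesA","SpeciesB"), 
       fill = c("blue","red"))

#add error bars (half-width 1.96*sd/sqrt(n), with n = 4 counts per bar)
ee<-matrix(c(Gene1SpeciesA.stdev,Gene2SpeciesA.stdev,Gene1SpeciesB.stdev,Gene2SpeciesB.stdev),2,2,byrow=TRUE)*1.96/sqrt(4)
#error.bar() is not part of base R; it must be defined before this call
error.bar(BarPlot,plot,ee)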
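The script calls error.bar(), which base R does not provide, and its definition is not shown. A minimal sketch of such a helper, assuming it simply wraps arrows() with flat caps, could look like this. The argument names and defaults are assumptions, not the original definition; the means and standard deviations are those of the four rows of df.

```r
# Hypothetical helper (not part of base R): draws a vertical bar with flat caps
# from y - lower to y + upper at each x position on the current plot.
error.bar <- function(x, y, upper, lower = upper, length = 0.05, ...) {
  arrows(x, y + upper, x, y - lower, angle = 90, code = 3, length = length, ...)
}

# Means and standard deviations of the four rows of df
# (rows = species A and B, columns = Gene1 and Gene2).
means <- matrix(c(16, 16.75, 28.5, 23), nrow = 2, byrow = TRUE)
sds   <- matrix(c(13.856406, 7.804913, 20.240224, 8.981462), nrow = 2, byrow = TRUE)
half  <- 1.96 * sds / sqrt(4)   # same half-width as in the script

# barplot() returns the bar midpoints, which give the x positions of the error bars.
mids <- barplot(means, beside = TRUE, names.arg = c("Gene1", "Gene2"),
                col = c("blue", "red"), ylab = "count",
                ylim = c(0, max(means + half)))
error.bar(mids, means, half)
```

Setting ylim explicitly keeps the upper caps inside the plotting region, since barplot() only scales the axis to the bar heights.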

The problem is that I need to do this for 50 genes and 4 species, so my script is going to get extremely long, and I guess it is not optimized. I tried to find help here, but I can't figure out a better way to do what I'd like. If I did not need error bars I could adapt this script, but the tricky part is to combine ggplot's beautiful barplots with error bars! ;)

If you have any ideas for optimizing my script, I would really appreciate it! :)

Thanks a lot!

Starting from your definition of df, you can do this in a few lines:

library(ggplot2)

cols = c(2,3,4,5)
df1  = transform(df, mean=rowMeans(df[cols]), sd=apply(df[cols],1, sd))

# df1 looks like this
#   Gene count1 count2 count3 count4 Species  mean        sd
#1 Gene1     12      4     36     12       A 16.00 13.856406
#2 Gene2     14      7     22     24       A 16.75  7.804913
#3 Gene1     16      9     54     35       B 28.50 20.240224
#4 Gene2     34     23     12     23       B 23.00  8.981462

ggplot(df1, aes(x=as.factor(Gene), y=mean, fill=Species)) +
  geom_bar(position=position_dodge(), stat="identity", colour='black') +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width=.2,position=position_dodge(.9))

[Image: resulting ggplot2 grouped barplot with error bars]
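Note that the code above draws bars of mean ± sd, whereas the question's script uses a half-width of 1.96*sd/sqrt(4). A possible variant that reproduces that half-width is sketched below. It assumes the replicate columns can be picked out by the name pattern "^count", and the column name ci is illustrative. It does not depend on the number of rows, so it should apply unchanged to 50 genes and 4 species.

```r
library(ggplot2)

# Data from the question; nothing below depends on the number of genes or species.
df <- data.frame(
  Gene    = c("Gene1", "Gene2", "Gene1", "Gene2"),
  count1  = c(12, 14, 16, 34),
  count2  = c(4, 7, 9, 23),
  count3  = c(36, 22, 54, 12),
  count4  = c(12, 24, 35, 23),
  Species = c("A", "A", "B", "B")
)

# Assumption: replicate columns are identified by name instead of by position.
count_cols <- grep("^count", names(df), value = TRUE)
n <- length(count_cols)

df$mean <- rowMeans(df[count_cols])
df$sd   <- apply(df[count_cols], 1, sd)
df$ci   <- 1.96 * df$sd / sqrt(n)   # half-width used in the question's script

# geom_col() is equivalent to geom_bar(stat = "identity").
p <- ggplot(df, aes(x = Gene, y = mean, fill = Species)) +
  geom_col(position = position_dodge(width = 0.9), colour = "black") +
  geom_errorbar(aes(ymin = mean - ci, ymax = mean + ci),
                width = 0.2, position = position_dodge(width = 0.9))
print(p)
```

Giving the bars and the error bars the same dodge width keeps each error bar centred on its bar. With 50 genes the x axis will be crowded, so rotating the axis labels or splitting the genes across several plots may be needed.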
