Looping through columns in an R data frame
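I am writing R code that plots histograms together with the median and quantiles, but I am having trouble looping over the columns of the data frame.
Below you can find the head of my data frame and my code.
The histograms are produced, but the median and quantile lines do not match the actual distributions.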
ROI DOY_119 DOY_127 DOY_143 DOY_151 DOY_175 DOY_191 DOY_215 DOY_239 DOY_263
1 4 -11.592668 -9.457701 -12.57275 -11.073490 -8.999743 -9.132843 -9.995659 -9.511699 -9.393022
2 4 -11.518109 -10.231917 -11.96543 -10.757207 -9.558524 -8.529423 -9.562449 -9.511699 -9.578184
3 4 -9.633711 -9.488475 -12.09012 -10.357404 -8.535619 -8.294449 -9.179331 -7.652297 -6.952941
4 4 -7.752080 -9.578184 -11.30182 -11.073490 -8.992849 -6.197888 -6.556077 -5.883803 -6.324577
5 4 -12.533658 -9.347939 -12.74088 -10.506100 -8.958544 -10.486625 -10.809219 -10.550241 -9.307020
6 4 -13.036436 -8.054857 -13.45823 -9.122186 -7.654827 -10.159230 -10.423927 -11.319436 -10.736576
for (i in 2:ncol(fileIn)){
myHist <- paste(directory, (i-1), sep="")
x11(width = 50, height = 50)
medi <- ddply(fileIn, "ROI", summarise, grp.medi=median (as.numeric(as.matrix(fileIn[i]))))
q05 <- ddply(fileIn, "ROI", summarise, grp.q05=quantile(as.numeric(as.matrix(fileIn[i]))),0.05)
q25 <- ddply(fileIn, "ROI", summarise, grp.q25=quantile(as.numeric(as.matrix(fileIn[i]))),0.25)
q75 <- ddply(fileIn, "ROI", summarise, grp.q75=quantile(as.numeric(as.matrix(fileIn[i]))),0.75)
q95 <- ddply(fileIn, "ROI", summarise, grp.q95=quantile(as.numeric(as.matrix(fileIn[i]))),0.95)
plotHist <-
ggplot(fileIn) +
aes(x = as.numeric(as.matrix(fileIn[i,]))) +
# aes(x = DOY_119) +
geom_histogram(alpha = 0.5, binwidth = 0.5, color="grey", fill= "yellow") +
geom_density(color = "green", fill= "green", alpha = 0.5) +
geom_vline(data=medi, aes(xintercept=grp.medi), color="red", size = 0.7) +
geom_vline(data=q05, aes(xintercept=grp.q05), color="black", size = 0.3) +
geom_vline(data=q25, aes(xintercept=grp.q25), color="blue", size = 0.5) +
geom_vline(data=q75, aes(xintercept=grp.q75), color="blue", size = 0.5) +
geom_vline(data=q95, aes(xintercept=grp.q95), color="black", size = 0.3) +
theme(axis.text.x = element_text(colour = "black"),
axis.text.y = element_text(colour = "black")) +
facet_wrap( ~ ROI, scales = "free")
plot(plotHist)
#------------------------------------------------------------------------------------------------------
# save the X11 device to file
dev.copy(jpeg, myHist, width=2000, height=1000, res=100)
dev.off()
}
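Here is a starting point for you. This is the kind of problem that can be solved in many ways, and one way is not always better than another. Some of what you do in your code is quite inefficient (for example, calculating the quantiles and creating the vlines). In ggplot, if you find yourself repeating very similar lines (such as the 5 calls to `geom_vline`), there is usually a better way. I have replaced these with a single computation of `vline_data` and fed that to `geom_vline`, together with some manual scales.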
library(plyr)     # provides ddply
library(ggplot2)
# add a second ROI for plotting/demonstration purposes
fileIn2 <- fileIn
fileIn2$ROI <- 5
fileIn <- rbind(fileIn,fileIn2)
myplots <- lapply(colnames(fileIn)[-1],function(col_of_interest){
#create summary_data for quantiles
vline_data <- ddply(fileIn,.(ROI), function(x){
myprobs=c(0.05,0.25,0.5,0.75,0.95)
res <- data.frame(prob=as.character(myprobs),value=quantile(x[,col_of_interest],probs=myprobs) )
res
})
#create plot. Note the use of aes_string here.
plotHist <-
ggplot(fileIn, aes_string(x=col_of_interest))+
geom_histogram(alpha = 0.5, binwidth = 0.5, color="grey", fill= "yellow") +
geom_density(color = "green", fill= "green", alpha = 0.5) +
geom_vline(data=vline_data, aes(xintercept=value,size=prob,color=prob)) +
# values follow the sorted prob levels: 0.05, 0.25, 0.5, 0.75, 0.95
scale_color_manual(values=c("black","blue","red","blue","black"),
breaks=as.character(c(0.05,0.25,0.5,0.75,0.95)))+
scale_size_manual(values=c(0.3,0.5,0.7,0.5,0.3),
breaks=as.character(c(0.05,0.25,0.5,0.75,0.95)))+
facet_wrap( ~ ROI, scales = "free")
#optional: use `ggsave` here.
#ggsave(filename=paste0(directory,col_of_interest,".png"),plot=plotHist)
return(plotHist)
}
)
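`myplots` is a list holding one ggplot object per measurement column, so each element has to be printed (for example `print(myplots[[1]])`) to be displayed.

As a cross-check on the quantile lines, here is a minimal base R sketch of the per-group computation. It rests on two assumptions about the question's code: that `fileIn[i]` inside `summarise` refers to the whole column rather than to the rows of one ROI, and that the probability (e.g. `0.05`) sits outside the closing parenthesis of `quantile()`, so it never reaches that function. The toy values, the second ROI and the helper name `group_quantiles` are made up for illustration.

```r
# Toy data modelled on the question: an ROI column plus one measurement column.
# The values and the second ROI (5) are invented for this example.
toy <- data.frame(
  ROI     = c(4, 4, 4, 4, 5, 5, 5, 5),
  DOY_119 = c(-11.6, -11.5, -9.6, -7.8, -1.0, -2.0, -3.0, -4.0)
)

probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)

# Hypothetical helper: quantiles of one column, computed separately per ROI,
# returned in long format (one row per ROI/probability pair).
group_quantiles <- function(df, column, probs) {
  pieces <- lapply(split(df, df$ROI), function(x) {
    # x holds only the rows of one ROI, so the quantiles are group-specific
    data.frame(
      ROI   = x$ROI[1],
      prob  = as.character(probs),
      value = unname(quantile(x[[column]], probs = probs))
    )
  })
  do.call(rbind, pieces)
}

vline_check <- group_quantiles(toy, "DOY_119", probs)
print(vline_check)

# For comparison: the median of the whole column, ignoring ROI
whole_column_median <- median(toy$DOY_119)
```

The per-ROI medians here differ from each other and from `whole_column_median`, which is the kind of mismatch that would make the vertical lines look unrelated to the faceted histograms. The resulting table has the same shape as `vline_data` above, so it could be passed to `geom_vline` in the same way.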