R, corrplot not blanking insignificant cells
I am desperately trying to solve an issue with R corrplot. I have a small matrix that I would like to visualize using R and the corrplot() function. I am using the following script to produce my corrplot:
library(corrplot)
corr_rohdaten<-read.csv(file="path_only.txt", sep="\t", header=TRUE)
M<-cor(corr_rohdaten)
cor.mtest <- function(mat, conf.level = 0.95) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat <- lowCI.mat <- uppCI.mat <- matrix(NA, n, n)
    diag(p.mat) <- 0
    diag(lowCI.mat) <- diag(uppCI.mat) <- 1
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], conf.level = conf.level)
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
            lowCI.mat[i, j] <- lowCI.mat[j, i] <- tmp$conf.int[1]
            uppCI.mat[i, j] <- uppCI.mat[j, i] <- tmp$conf.int[2]
        }
    }
    return(list(p.mat, lowCI.mat, uppCI.mat))
}
res1 <- cor.mtest(M, 0.95)
res2 <- cor.mtest(M, 0.99)
## specialized the insignificant value according to the significant level
##corrplot(M, p.mat = res1[[1]], sig.level = 0.2)
corrplot(M, p.mat = res1[[1]], insig = "blank", method="color", tl.col="black", type="lower", tl.cex=1.2)
The problem now is that the corrplot shows a lot of fields that, according to SPSS and R, are non-significant and should be blanked out.
path_only.txt looks like this:
MyoD9w MyoD52w TNC9w TNC52w OxCI9w OxCI52w OxCII9w OxCII52w OxCIII9w OxCIII52w OxCV9w OxCV52w Myogenin9w Myogenin52w VEGF9w VEGF52w COX4I29w COX4I252w
0.6508 0.3862 0.0888 0.8239 1.3390 4.2471 2.7513 6.1901 4.9440 6.0180 1.1619 0.9240 2.0130 1.5483 1.0000 0.6016 0.7826 0.8244
0.1956 0.3954 0.1959 3.1073 1.3103 2.2127 0.9862 5.4116 1.7748 5.3906 0.7411 1.0165 2.1557 2.6942 1.1199 1.0128 1.4144 1.8681
0.6217 1.0000 0.4912 1.0000 0.4237 2.1208 0.7313 2.7154 0.5653 0.9250 0.9000 0.7145 4.8147 6.3509 1.2985 1.4768 2.2194 1.0000
My assumption is that something is going wrong when calculating or comparing the p-values, or that they get rounded.
Maybe for someone like you the mistake is visible within seconds. I spent hours googling.
Another question I would be interested in solving: I would love to show only correlations with p < 0.05 and r^2 > 0.7 (or, respectively, r^2 < 0.7). If that can be done within this graph, I will ship over some beers to the first one solving this issue properly!
I think you want the significance of the correlation between columns in corr_rohdaten. Here is a way to extract statistics from the cor.test function, using outer to apply the test to the various combinations of columns.
## Define a function to operate on combinations of columns,
## vectorized for arguments x and y so we can pass it to `outer`
f <- Vectorize(function(x, y, data, statistic, dim=1, ...)
    cor.test(data[,x], data[,y], ...)[[statistic]][[dim]], vectorize.args=c("x", "y"))
## Wrap this in another function to get the p.value and confidence intervals
myCorr <- function(data, conf.level=0.95) {
    pvals <- outer(names(data), names(data), f,
                   data=data, statistic="p.value",  # additional arguments to f
                   conf.level=conf.level)           # additional to cor.test
    lower <- outer(names(data), names(data), f, data=data, statistic="conf.int",
                   conf.level=conf.level)
    upper <- outer(names(data), names(data), f, data=data, statistic="conf.int", dim=2,
                   conf.level=conf.level)
    list(pvals=pvals, lower=lower, upper=upper)
}
## To recreate the plot you have, you would need to test the correlation between
## the columns of the correlation matrix, M. I wasn't sure if this was what you wanted.
pvals2 <- outer(1:ncol(M), 1:ncol(M), f, data=M, statistic="p.value", conf.level=0.95)
## Make corrplot using p.values from test between columns of corr_rohdaten
## The upper and lower confidence intervals will be NULL because there are only
## 3 observations (conf.int requires at least 4)
stats <- myCorr(corr_rohdaten, conf.level = 0.95)
library(corrplot)
corrplot(M, p.mat = stats[["pvals"]], insig = "blank", method="color", tl.col="black",
         type="lower", tl.cex=1.2, diag=FALSE)
This isn't the most efficient way to do this. It reruns each correlation test three times (once for the p-value and once for each confidence bound), but I find it reasonably simple, and outer is quite fast.
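One way to see why the original script left cells unblanked: the question's code passed the correlation matrix M to cor.mtest, so each cor.test call compared two columns of M (one value per variable) instead of two columns of the raw measurements (three rows in the excerpt shown). The sketch below uses made-up random data of an assumed shape (3 rows, 6 columns) purely to show how the degrees of freedom differ; the names raw, df_raw and df_M are illustrative and not from the original post.

```r
## Made-up stand-in for the real data: 3 observations of 6 variables
set.seed(1)
raw <- matrix(rnorm(3 * 6), nrow = 3)
M <- cor(raw)                      # 6 x 6 correlation matrix

## Test on the raw columns: n = 3 observations, so df = n - 2 = 1
df_raw <- unname(cor.test(raw[, 1], raw[, 2])$parameter)

## Test on columns of M: n = 6 "observations", so df = n - 2 = 4
df_M <- unname(cor.test(M[, 1], M[, 2])$parameter)

c(raw = df_raw, M = df_M)
```

With 18 variables, as in path_only.txt, the test on columns of M would run with 16 degrees of freedom instead of 1, so its p-values are not the ones that SPSS or cor.test on the raw columns would report.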
Fixing your function to reproduce these results is quite simple. Just pass it corr_rohdaten instead of M, and add a check for enough observations before extracting conf.int.
cor.mtest <- function(mat, conf.level = 0.95) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat <- lowCI.mat <- uppCI.mat <- matrix(NA, n, n)
    diag(p.mat) <- 0
    diag(lowCI.mat) <- diag(uppCI.mat) <- 1
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], conf.level = conf.level)
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
            ## conf.int is only available with at least 4 observations
            if (dim(mat)[1] > 3) {
                lowCI.mat[i, j] <- lowCI.mat[j, i] <- tmp$conf.int[1]
                uppCI.mat[i, j] <- uppCI.mat[j, i] <- tmp$conf.int[2]
            }
        }
    }
    list(p=p.mat, lower=lowCI.mat, upper=uppCI.mat)
}
## Pass the raw data, not the correlation matrix M
res <- cor.mtest(corr_rohdaten)
corrplot(M, p.mat = res[["p"]], insig = "blank", method="color", tl.col="black",
         type="lower", tl.cex=1.2, diag=FALSE)
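The follow-up question (show only cells with p < 0.05 and r^2 > 0.7) is not covered by the code above. One possible approach, sketched here under assumptions, is to overwrite the p-value of every cell that fails the r^2 threshold with 1, so that insig = "blank" hides it as well. The helper name mask_pvals, its r2.min argument, and the small toy matrices are hypothetical and only for illustration.

```r
## Hypothetical helper: set the p-value to 1 wherever r^2 does not exceed
## r2.min, so those cells count as "insignificant" when plotting
mask_pvals <- function(M, p.mat, r2.min = 0.7) {
    masked <- p.mat
    masked[M^2 <= r2.min] <- 1
    masked
}

## Toy correlation and p-value matrices (made-up numbers)
M_toy <- matrix(c( 1.00,  0.90,  0.50,
                   0.90,  1.00, -0.95,
                   0.50, -0.95,  1.00), nrow = 3, byrow = TRUE)
p_toy <- matrix(c(0.00, 0.01, 0.01,
                  0.01, 0.00, 0.20,
                  0.01, 0.20, 0.00), nrow = 3, byrow = TRUE)

masked <- mask_pvals(M_toy, p_toy)
masked
```

The masked matrix could then be passed as corrplot(M, p.mat = mask_pvals(M, res[["p"]]), sig.level = 0.05, insig = "blank", ...). To keep only the weak correlations instead (r^2 < 0.7), the comparison inside the helper would be flipped to M^2 >= r2.min.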