R, corrplot not blanking insig

I am desperately trying to solve an issue with R corrplot. I have a small matrix that I would like to visualize using R and the corrplot() function. I am using the following script to produce my corrplot:

library(corrplot)
corr_rohdaten<-read.csv(file="path_only.txt", sep="\t", header=TRUE)
M<-cor(corr_rohdaten)
cor.mtest <- function(mat, conf.level = 0.95) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat <- lowCI.mat <- uppCI.mat <- matrix(NA, n, n)
    diag(p.mat) <- 0
    diag(lowCI.mat) <- diag(uppCI.mat) <- 1
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], conf.level = conf.level)
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
            lowCI.mat[i, j] <- lowCI.mat[j, i] <- tmp$conf.int[1]
            uppCI.mat[i, j] <- uppCI.mat[j, i] <- tmp$conf.int[2]
        }
    }
    return(list(p.mat, lowCI.mat, uppCI.mat))
}

res1 <- cor.mtest(M, 0.95)
res2 <- cor.mtest(M, 0.99)
## specialized the insignificant value according to the significant level
##corrplot(M, p.mat = res1[[1]], sig.level = 0.2)
corrplot(M, p.mat = res1[[1]], insig = "blank", method="color", tl.col="black", type="lower", tl.cex=1.2)

The problem is that the corrplot still shows many cells that, according to both SPSS and R, are non-significant and should be blanked out.

path_only.txt looks like this:

MyoD9w  MyoD52w TNC9w   TNC52w  OxCI9w  OxCI52w OxCII9w OxCII52w    OxCIII9w    OxCIII52w   OxCV9w  OxCV52w Myogenin9w  Myogenin52w VEGF9w  VEGF52w COX4I29w    COX4I252w
0.6508  0.3862  0.0888  0.8239  1.3390  4.2471  2.7513  6.1901  4.9440  6.0180  1.1619  0.9240  2.0130  1.5483  1.0000  0.6016  0.7826  0.8244
0.1956  0.3954  0.1959  3.1073  1.3103  2.2127  0.9862  5.4116  1.7748  5.3906  0.7411  1.0165  2.1557  2.6942  1.1199  1.0128  1.4144  1.8681
0.6217  1.0000  0.4912  1.0000  0.4237  2.1208  0.7313  2.7154  0.5653  0.9250  0.9000  0.7145  4.8147  6.3509  1.2985  1.4768  2.2194  1.0000

My assumption is that something is going wrong when the p-values are calculated or compared, or that they get rounded.

Maybe for someone like you the mistake is visible within seconds. I have spent hours googling.

Another question I would like to solve: I would love to show only correlations with p < 0.05 and r^2 > 0.7 resp. r^2 < 0.7. If that can be done within this graph, I will ship some beers to the first person who solves this properly!

I think you want the significance of the correlation between columns in corr_rohdaten, rather than between columns of the correlation matrix M, which is what cor.mtest(M, ...) tests. Here is a way to extract statistics from the cor.test function, using outer to apply the test to the various combinations of columns.

## Define a function to operate on combinations of columns,
## vectorized over arguments x and y so we can pass it to `outer`
f <- Vectorize(function(x, y, data, statistic, dim=1, ...)
    cor.test(data[,x], data[,y], ...)[[statistic]][[dim]],
    vectorize.args=c("x", "y"))

## Wrap this in another function to get the p.value and confidence intervals
myCorr <- function(data, conf.level=0.95) {
    pvals <- outer(names(data), names(data), f,
                   data=data, statistic="p.value",          # additional arguments to f
                   conf.level=conf.level)                   # additional to cor.test
    ## conf.level must also be passed here, otherwise the default 0.95 is used
    lower <- outer(names(data), names(data), f, data=data, statistic="conf.int",
                   conf.level=conf.level)
    upper <- outer(names(data), names(data), f, data=data, statistic="conf.int", dim=2,
                   conf.level=conf.level)
    list(pvals=pvals, lower=lower, upper=upper)
}

## To recreate the plot you have, you would need to test the correlation between
## the columns of the correlation matrix, M.  I wasn't sure if this was what you wanted.
pvals2 <- outer(1:ncol(M), 1:ncol(M), f, data=M, statistic="p.value", conf.level=0.95)

## Make corrplot using p.values from test between columns of corr_rohdaten
## The upper and lower confidence intervals will be NULL because there are only
## 3 observations (conf.int requires at least 4)
stats <- myCorr(corr_rohdaten, conf.level = 0.95)

library(corrplot)
corrplot(M, p.mat = stats[["pvals"]], insig = "blank", method="color", tl.col="black", 
         type="lower", tl.cex=1.2, diag=F)


This isn't the most efficient way to do this. It reruns each correlation test three times (once for the p-value and once for each confidence bound), but I find it reasonably simple, and outer is quite fast.
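If the repeated tests matter, the Pearson p-values can also be computed in one pass from the correlation matrix, because the test statistic is t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, where n is the number of rows of the raw data. The following is a minimal sketch and not part of the original answer: `cor_pvalues` is a hypothetical helper name, the data frame `d` is made-up demo data, and the code assumes Pearson correlation with no missing values.

```r
## Hypothetical helper: two-sided Pearson p-values for all column pairs,
## computed from the correlation matrix instead of repeated cor.test calls.
## Assumes no missing values, so every pair uses the same n observations.
cor_pvalues <- function(data) {
    data <- as.matrix(data)
    n <- nrow(data)                              # observations = rows of raw data
    r <- cor(data)
    df <- n - 2
    tstat <- r * sqrt(df / pmax(0, 1 - r^2))     # pmax guards against rounding
    p <- r                                       # copy to keep dim and dimnames
    p[] <- 2 * pt(-abs(tstat), df)
    diag(p) <- 0                                 # same convention as cor.mtest
    p
}

## Made-up demo data, only to compare one cell against cor.test
set.seed(1)
d <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
p <- cor_pvalues(d)
all.equal(unname(p["a", "b"]), unname(cor.test(d$a, d$b)$p.value))
```

Because n enters through the degrees of freedom, the result depends on how many rows the raw data has, which is why the raw data and not M has to be supplied.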

Fixing your function to reproduce these results is quite simple. Just pass it corr_rohdaten instead of M, and add a check for enough observations before extracting conf.int.

cor.mtest <- function(mat, conf.level = 0.95) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat <- lowCI.mat <- uppCI.mat <- matrix(NA, n, n)
    diag(p.mat) <- 0
    diag(lowCI.mat) <- diag(uppCI.mat) <- 1
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], conf.level = conf.level)
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
            if (dim(mat)[1] > 3) {
                lowCI.mat[i, j] <- lowCI.mat[j, i] <- tmp$conf.int[1]
                uppCI.mat[i, j] <- uppCI.mat[j, i] <- tmp$conf.int[2]
            }
        }
    }
    list(p=p.mat, lower=lowCI.mat, upper=uppCI.mat)
}

res <- cor.mtest(corr_rohdaten)
corrplot(M, p.mat = res[["p"]], insig = "blank", method="color", tl.col="black", 
         type="lower", tl.cex=1.2, diag=F)
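For the second part of the question (showing only strong, significant correlations), one option is to fold the strength criterion into the matrix passed as p.mat, so that insig = "blank" hides every cell that fails either test. This is a sketch under assumptions and not something the answer above covers: `mask_weak` is a hypothetical helper, the cutoff is read as r^2 > 0.7, and the demo data are invented.

```r
## Hypothetical helper: return a copy of p.mat in which every cell whose
## squared correlation is at or below `r2_min` is forced to p = 1, so that
## corrplot(..., insig = "blank") blanks it along with non-significant cells.
mask_weak <- function(corr, p.mat, r2_min = 0.7) {
    stopifnot(all(dim(corr) == dim(p.mat)))
    masked <- p.mat
    masked[corr^2 <= r2_min] <- 1
    masked
}

## Invented demo data: b tracks a closely, c is unrelated noise
set.seed(42)
a <- rnorm(20)
demo <- data.frame(a = a, b = a + rnorm(20, sd = 0.1), c = rnorm(20))
M_demo <- cor(demo)
p_demo <- sapply(demo, function(x) sapply(demo, function(y) cor.test(x, y)$p.value))
p_keep <- mask_weak(M_demo, p_demo)

## Plot only if corrplot is installed
if (requireNamespace("corrplot", quietly = TRUE)) {
    corrplot::corrplot(M_demo, p.mat = p_keep, sig.level = 0.05, insig = "blank",
                       method = "color", tl.col = "black", type = "lower",
                       diag = FALSE)
}
```

With the data from the question, the equivalent call would pass mask_weak(M, res[["p"]]) as p.mat. If the intended criterion is on r rather than r^2, only the comparison inside mask_weak needs to change.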
