How to calculate p-value for Kendall Tau correlation coefficients in R?
I have calculated Kendall correlation coefficients using:
corr_test <- cor.test(values, use = "pairwise", method="kendall")
corr_test
but I need the p-value. I cannot find any packages that provide a p-value for the Kendall correlations.
How can I calculate the p-value for Kendall tau correlation coefficients?
The goal of this task is to generate a correlation plot in which colored cells indicate significant correlation coefficients. I am using Kendall tau because there are many ties in my data and one variable is a factor.
You can simply iterate over the columns (or rows, if you prefer) of your data and call cor.test() on each combination of columns, as follows:
# Use some data
mat <- iris[,1:4]
# Index combinations of columns
# Not very efficient, but it'll do for now
idx <- expand.grid(colnames(mat), colnames(mat))
# Loop over indices, calculate p-value
pvals <- apply(idx, 1, function(i){
x <- mat[,i[[1]]]
y <- mat[,i[[2]]]
cor.test(x, y, method = "kendall")$p.value
})
# Combine indices with pvalues, do some sort of multiple testing correction
# Note that we are testing column combinations twice
# so we're overcorrecting with the FDR here
pvals <- cbind.data.frame(idx, pvals = p.adjust(pvals, "fdr"))
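Since the question mentions many ties, one detail may be worth a quick illustration. For small samples (fewer than 50 pairs) that contain ties, cor.test() cannot compute an exact Kendall p-value; it emits a warning and uses a normal approximation instead. The sketch below requests that approximation explicitly with exact = FALSE. The vectors x and y are made up for illustration and are not from the original post.

```r
# Made-up toy data with ties in both vectors (illustration only)
x <- c(1, 2, 2, 3, 4, 4, 5)
y <- c(2, 1, 3, 3, 5, 4, 6)

# exact = FALSE asks for the normal approximation directly,
# so no exact p-value is attempted for this small, tied sample
res <- cor.test(x, y, method = "kendall", exact = FALSE)

tau  <- unname(res$estimate)  # Kendall's tau estimate
pval <- res$p.value           # approximate two-sided p-value
```

With larger data sets such as iris (150 rows), cor.test() already defaults to the approximation, so the loop above is unaffected.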
Next, supplement the adjusted p-values with the regular correlation values and combine the two.
# Calculate basic correlation
cors <- cor(mat, method = "kendall")
cors <- reshape2::melt(cors)
# Indices of correlations and pvalues should be the same, thus can be merged
if (identical(cors[,1:2], pvals[,1:2])) {
df <- cbind.data.frame(pvals, cor = cors[,3])
}
And plot the data in the following fashion:
library(ggplot2)
# Plot a matrix
ggplot(df, aes(Var1, Var2, fill = ifelse(pvals < 0.05, cor, 0))) +
geom_raster() +
scale_fill_gradient2(name = "Significant Correlation", limits = c(-1, 1))
Another option is to use idx <- t(combn(colnames(mat), 2)), in which case the multiple testing correction is appropriate because each pair is tested only once, but you'll have to figure out how to match these values up with the correlations again.
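One possible way to do that matching is sketched below. It is not part of the original answer: the names raw_p, adj_p and pmat are introduced here for illustration, and exact = FALSE is an added assumption that keeps the p-values on the normal approximation. The idea is to test each unordered pair once, adjust those p-values, and write them into both triangles of a symmetric matrix that lines up with cor(mat).

```r
# Same example data as above
mat  <- iris[, 1:4]
vars <- colnames(mat)

# One row per unique pair of columns (no self-pairs, no duplicates)
idx <- t(combn(vars, 2))

# Raw Kendall p-value for each unique pair
raw_p <- apply(idx, 1, function(i) {
  cor.test(mat[[i[1]]], mat[[i[2]]], method = "kendall", exact = FALSE)$p.value
})

# FDR correction over the unique tests only
adj_p <- p.adjust(raw_p, method = "fdr")

# Symmetric matrix of adjusted p-values; the diagonal stays NA
pmat <- matrix(NA_real_, length(vars), length(vars),
               dimnames = list(vars, vars))
pmat[idx]          <- adj_p  # fill one triangle, indexing by name
pmat[idx[, 2:1]]   <- adj_p  # mirror into the other triangle

# Long format: one row per matrix cell, correlations next to p-values
cors <- cor(mat, method = "kendall")
df <- data.frame(
  Var1  = factor(rownames(cors)[row(cors)], levels = vars),
  Var2  = factor(colnames(cors)[col(cors)], levels = vars),
  cor   = as.vector(cors),
  pvals = as.vector(pmat)
)
```

The resulting df has the same column names as the one plotted above. The diagonal cells carry NA p-values, so they will be drawn in the scale's NA color unless they are filtered out or filled in first.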