How to calculate a correlation matrix between binary variables in R?

I have a data frame of 10 binary variables that looks like this:

V1 V2 V3...
0  1  1
1  1  0
1  0  1
0  0  1  

I need to get the correlation matrix so that I can then do a factor analysis.
psych::corr.test can calculate a correlation matrix, but it only offers the pearson, spearman, and kendall methods, which are not meant for binary data.
So how can I calculate the correlation matrix of this data frame?

Correlation methods are suitable for continuous data. https://www.quora.com/Is-it-possible-to-calculate-correlations-between-binary-variables
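As a side note on binary data specifically: applying Pearson's formula to two 0/1 variables gives what is usually called the phi coefficient, so base R's cor() does return a usable matrix for such data. The sketch below runs on made-up data (the name df, the seed, and the sizes are assumptions for illustration) and cross-checks one pair of columns against the 2x2 contingency-table formula. The psych package also provides tetrachoric(), which is often used before factor analysis of binary items; it is not shown here.

```r
# Made-up example data: 50 rows, 4 binary columns (hypothetical, for illustration)
set.seed(1)
df <- as.data.frame(matrix(rbinom(200, size = 1, prob = 0.5), ncol = 4))

# Pearson correlation on 0/1 columns; for binary variables this is the phi coefficient
phi_mat <- cor(df)

# Cross-check one pair against the 2x2 contingency-table formula
tab <- table(factor(df$V1, levels = 0:1), factor(df$V2, levels = 0:1))
n11 <- tab["1", "1"]; n10 <- tab["1", "0"]
n01 <- tab["0", "1"]; n00 <- tab["0", "0"]
phi_12 <- (n11 * n00 - n10 * n01) /
  sqrt(prod(as.numeric(rowSums(tab)), as.numeric(colSums(tab))))

# TRUE when both computations agree
all.equal(unname(phi_mat["V1", "V2"]), as.numeric(phi_12))
```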

You can try nonparametric methods; see http://www.cedar.buffalo.edu/papers/articles/CVPRIP03_propbina.pdf
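The linked paper deals with dissimilarity measures for binary vectors. As one illustration, base R's dist() with method = "binary" computes a Jaccard-type distance: among the positions where at least one of two vectors is 1, the proportion where only one of them is. A minimal sketch on made-up data follows; the choice of this particular measure is mine and not necessarily the one the answer had in mind.

```r
# Made-up example data: 20 rows, 6 binary columns (hypothetical, for illustration)
set.seed(2)
x <- matrix(rbinom(120, size = 1, prob = 0.5), ncol = 6,
            dimnames = list(NULL, paste0("V", 1:6)))

# dist() compares rows, so transpose to compare variables (columns)
d <- dist(t(x), method = "binary")

# Turn the distance into a similarity matrix (1 = identical pattern of 1s)
sim <- 1 - as.matrix(d)
round(sim, 2)
```

Unlike simple matching, this measure ignores rows where both variables are 0, which may or may not be what you want.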

You can still achieve what factor analysis is for: calculate the percentage match between each pair of variables and remove the variables that match more than x%. This way you can reduce the dimensionality of the data.

# create example data: 20 rows x 10 binary columns
m <- matrix(sample(x = 0:1, size = 200, replace = TRUE), ncol = 10)
colnames(m) <- LETTERS[1:10]
m
# build the pairwise matching matrix: for each pair of columns,
# the proportion of rows in which both hold the same value
# (a similarity measure, not a correlation)
res <- sapply(seq_len(ncol(m)), function(i) {
  colMeans(m == m[, i])
})
dimnames(res) <- list(colnames(m), colnames(m))
res
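The removal step described above (dropping variables that match more than x%) could be sketched as follows. The threshold value, the forced duplicate column, and the rule of dropping the later column of each pair are all assumptions made for illustration.

```r
set.seed(3)
m <- matrix(sample(0:1, size = 200, replace = TRUE), ncol = 10)
colnames(m) <- LETTERS[1:10]
m[, "B"] <- m[, "A"]   # force one duplicate column so that something gets dropped

# proportion of rows in which each pair of columns agrees
res <- sapply(seq_len(ncol(m)), function(i) colMeans(m == m[, i]))
dimnames(res) <- list(colnames(m), colnames(m))

threshold <- 0.9       # the "x%" cutoff; an arbitrary value for illustration

# look only at the upper triangle so each pair is considered once
high <- which(res > threshold & upper.tri(res), arr.ind = TRUE)

# drop the later column of every pair that matches too closely
drop_cols <- unique(colnames(res)[high[, "col"]])
m_reduced <- m[, setdiff(colnames(m), drop_cols), drop = FALSE]
colnames(m_reduced)
```

This rule guarantees that no remaining pair exceeds the threshold, but it can drop more columns than strictly necessary when matches form chains.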

You can use hierarchical clustering on the columns:

hclust(dist(t(x)))  # x is the binary matrix; hclust() needs a dist object, and t() makes dist() compare columns

or, even better, you can choose a clustering method from "ward.D", "ward.D2", "single", "complete"... https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/hclust
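A slightly fuller sketch of clustering the columns, on made-up data. The binary distance, the "complete" linkage, and k = 3 are arbitrary choices for illustration; any of the linkage names listed above can be passed as method.

```r
set.seed(4)
m <- matrix(sample(0:1, size = 200, replace = TRUE), ncol = 10)
colnames(m) <- LETTERS[1:10]

# hclust() expects a dissimilarity object; dist() compares rows, hence t(m)
d <- dist(t(m), method = "binary")
hc <- hclust(d, method = "complete")

# cut the tree into k groups of variables (k = 3 is an arbitrary choice)
groups <- cutree(hc, k = 3)
groups
```

Calling plot(hc) afterwards draws the dendrogram, which helps when picking k.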

Another solution is to visualize your binary matrix as a heatmap, so that similar variables with common features show up next to each other.
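A minimal sketch of the heatmap idea using base R's heatmap(), which reorders rows and columns by hierarchical clustering before drawing. The data, the colours, and the temporary output file are assumptions for illustration.

```r
set.seed(5)
m <- matrix(sample(0:1, size = 200, replace = TRUE), ncol = 10)
colnames(m) <- LETTERS[1:10]

# write to a temporary PDF so the example also runs non-interactively
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
# scale = "none" keeps the raw 0/1 values; white = 0, black = 1
hm <- heatmap(m, scale = "none", col = c("white", "black"))
invisible(dev.off())

# column order chosen by the clustering: neighbours are the most similar variables
colnames(m)[hm$colInd]
```

In an interactive session the pdf() and dev.off() lines can be left out to draw on screen instead.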
