
Filter a correlation matrix based on value and occurrence

Does anyone have a way to filter a correlation matrix (or a list of correlations) based on a ranking that includes both value and breadth? For example, if a certain variable has a high enough correlation with a large enough number of other variables, keep it. If a variable does not meet these criteria, filter it out.

As an example: if a variable has a correlation > 0.25 with more than 3 other variables, keep it. If not, discard it.

Currently I'm able to construct a correlation matrix and filter it based on values, but I have not been able to progress past this. For filtering, I'm setting values whose absolute value is below my threshold to 0:

correlation_matrix <- round(cor(data, method = "pearson", use = "pairwise.complete.obs"), digits = 4)
correlation_matrix[abs(correlation_matrix) < 0.13] <- 0

I've now done this using apply, as Rui mentioned above.

This code selects all rows (and columns) of the correlation matrix that contain at least 75 (breadth) correlations with an absolute value of at least 0.2 (threshold):

1) Define the variables; set the diagonal values from 1 to 0 so that each variable's correlation with itself is not counted

threshold <- 0.2
breadth <- 75
correlation_matrix_filter <- correlation_matrix
diag(correlation_matrix_filter) <- 0

2) Count how many values per row have an absolute value greater than or equal to the threshold of 0.2

filter <- apply(correlation_matrix_filter,1, function(x) sum(abs(x) >= threshold))

3) Select only the rows with at least 75 values at or above the threshold; subset the original correlation matrix to include only these rows (and columns)

sel <- filter >= breadth
correlation_matrix_final <- correlation_matrix[sel, sel, drop = FALSE]
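
The three steps can also be wrapped into one function. What follows is only a minimal sketch, not a tested general solution: the helper name filter_by_breadth and the small hand-made matrix toy_cor are made up for illustration, and rowSums on a logical matrix is swapped in for the apply call (it should give the same counts).

```r
# Hypothetical helper (name is an assumption): keep the variables that have
# at least `breadth` correlations with |r| >= `threshold`.
filter_by_breadth <- function(cor_mat, threshold, breadth) {
  strong <- abs(cor_mat) >= threshold      # TRUE where a correlation is "strong"
  diag(strong) <- FALSE                    # do not count self-correlations
  counts <- rowSums(strong, na.rm = TRUE)  # strong correlations per variable
  keep <- counts >= breadth
  cor_mat[keep, keep, drop = FALSE]        # drop = FALSE keeps a matrix even for 1 variable
}

# Hand-made toy matrix (illustrative values only, not real data)
vars <- c("a", "b", "c", "d", "e")
toy_cor <- matrix(c(
   1.0,  0.8, -0.7, 0.1, 0.0,
   0.8,  1.0, -0.6, 0.0, 0.1,
  -0.7, -0.6,  1.0, 0.2, 0.0,
   0.1,  0.0,  0.2, 1.0, 0.3,
   0.0,  0.1,  0.0, 0.3, 1.0
), nrow = 5, byrow = TRUE, dimnames = list(vars, vars))

result <- filter_by_breadth(toy_cor, threshold = 0.5, breadth = 2)
print(colnames(result))  # "a" "b" "c"
```

Note that the counts are taken over the full matrix, so after subsetting, a retained variable may have fewer than breadth strong correlations among the variables that remain. If that matters, the filter could be applied repeatedly until the selection stops changing.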
