Returning the highest and lowest correlations from a correlation matrix in pandas
Get pairs of variables from correlation matrix that minimize the sum of correlations
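Here is a solution that is not particularly efficient but works nonetheless. It picks 3 out of 4 features, and it extends easily to 6 out of 10 if you change n_features from 3 to 6 and list all ten columns in test_cols.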
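import itertools

import numpy as np
import pandas as pd

# Correlation matrix in "wide" form: the 'vars' column names each row.
foo = pd.DataFrame({'vars': ['col_a', 'col_b', 'col_c', 'col_d'],
                    'col_a': [1, 0.9, 0.04, 0.03],
                    'col_b': [0.9, 1, 0.05, 0.03],
                    'col_c': [0.04, 0.05, 1, -0.04],
                    'col_d': [0.03, 0.03, -0.04, 1]})

n_features = 3
test_cols = ['col_a', 'col_b', 'col_c', 'col_d']

# Map each candidate subset of features to the sum of its absolute pairwise correlations.
sum_l = {}
for l in itertools.combinations(test_cols, n_features):
    sum_l2 = 0
    # Every unordered pair inside the subset contributes once.
    for l2 in itertools.combinations(l, 2):
        # Look up row l2[0], column l2[1] of the correlation matrix.
        sum_l2 += np.abs(foo.query('vars == @l2[0]')[l2[1]].values[0])
    sum_l[l] = sum_l2

print(sum_l)
# The subset with the smallest total absolute correlation.
print(min(sum_l, key=sum_l.get))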
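The same exhaustive search can be sketched without a query call per pair, which may help once there are more columns. This is only a sketch under a few assumptions: the correlation matrix is stored with the variable names as its index (instead of a separate vars column), it is symmetric, and the helper name least_correlated_subset is made up for illustration. It still enumerates every subset, so the number of candidates grows combinatorially.

```python
import itertools

import numpy as np
import pandas as pd

# Same example values as above, but with the variable names as the index
# (an assumption of this sketch; the original keeps them in a 'vars' column).
names = ["col_a", "col_b", "col_c", "col_d"]
corr = pd.DataFrame(
    {
        "col_a": [1, 0.9, 0.04, 0.03],
        "col_b": [0.9, 1, 0.05, 0.03],
        "col_c": [0.04, 0.05, 1, -0.04],
        "col_d": [0.03, 0.03, -0.04, 1],
    },
    index=names,
)


def least_correlated_subset(corr, n_features):
    """Hypothetical helper: brute-force search for the subset of columns
    whose absolute pairwise correlations have the smallest sum."""
    abs_corr = corr.abs().to_numpy()
    cols = list(corr.columns)
    best_subset, best_score = None, np.inf
    for idx in itertools.combinations(range(len(cols)), n_features):
        # Sub-matrix restricted to the chosen rows and columns.
        block = abs_corr[np.ix_(idx, idx)]
        # Strict upper triangle: each pair counted once, diagonal excluded.
        score = np.triu(block, k=1).sum()
        if score < best_score:
            best_subset = tuple(cols[i] for i in idx)
            best_score = score
    return best_subset, best_score


subset, score = least_correlated_subset(corr, 3)
print(subset, round(score, 2))
```

For the example matrix this selects col_a, col_c and col_d, the same subset the loop above reports as the minimum.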