Calculate a linear regression slope matrix (analogous to a correlation matrix) - Python/Pandas

Pandas has a really nice function, pd.DataFrame.corr(), that gives you a correlation matrix DataFrame for your data DataFrame.

The r of a correlation, however, isn't always that informative. Depending on your application, the slope of the linear regression might be just as important. Is there any function that can return the matrix of pairwise slopes for an input matrix or DataFrame?

Other than iterating over column pairs with scipy.stats.linregress(), which would be a pain, I don't see any way to do this.

The slope of a regression line y = b_0 + b_1 * x can also be calculated from the correlation coefficient: b_1 = corr(x, y) * σ_y / σ_x
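
For reference, this identity follows from the least-squares slope being the covariance of x and y divided by the variance of x, writing r for corr(x, y):

```latex
b_1 = \frac{\operatorname{cov}(x, y)}{\sigma_x^2}
    = \frac{r \, \sigma_x \sigma_y}{\sigma_x^2}
    = r \, \frac{\sigma_y}{\sigma_x}
```

Swapping the roles of x and y gives r * σ_x / σ_y instead, which is why the slope matrix, unlike the correlation matrix, is not symmetric.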

Using numpy's newaxis to create the σ_y / σ_x matrix (rows are x, columns are y):

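import numpy as np
# Entry [row, col] is the slope of col regressed on row: corr * std(col) / std(row)
df.corr() * (df.std().values / df.std().values[:, np.newaxis])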
Out[59]: 
          A         B         C
A  1.000000 -0.686981  0.252078
B -0.473282  1.000000 -0.263359
C  0.137670 -0.208775  1.000000

where df is:

df
Out[60]: 
   A  B  C
0  5  6  9
1  4  4  2
2  7  3  5
3  4  3  9
4  6  5  3
5  3  8  6
6  2  8  1
7  7  2  7
8  4  1  5
9  1  6  6
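
As a self-contained sketch of the same idea, the slope matrix can also be built directly from the covariance matrix, since slope = cov(x, y) / var(x). The helper name slope_matrix is hypothetical (it is not a pandas function), and the data is the df shown above:

```python
import numpy as np
import pandas as pd


def slope_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Entry [x, y] is the least-squares slope of column y regressed on column x."""
    cov = df.cov()  # pairwise sample covariances
    # The diagonal of the covariance matrix holds each column's variance.
    var = pd.Series(np.diag(cov), index=cov.index)
    # Divide every row x by var(x): slope[x, y] = cov(x, y) / var(x)
    return cov.div(var, axis=0)


df = pd.DataFrame({
    "A": [5, 4, 7, 4, 6, 3, 2, 7, 4, 1],
    "B": [6, 4, 3, 3, 5, 8, 8, 2, 1, 6],
    "C": [9, 2, 5, 9, 3, 6, 1, 7, 5, 6],
})
slopes = slope_matrix(df)
print(slopes)
```

Reading the result as slopes.loc[x, y] keeps the orientation explicit: the row label is the predictor and the column label is the response.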

And this is for verification, computing each slope individually with scipy.stats.linregress:

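import itertools
import numpy as np
from scipy.stats import linregress
res = []
for col1, col2 in itertools.product(df.columns, repeat=2):
    # linregress(x, y) fits y = b0 + b1 * x, so col1 is x and col2 is y
    res.append(linregress(df[col1], df[col2]).slope)
np.array(res).reshape(3, 3)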
Out[72]: 
array([[ 1.        , -0.68698061,  0.25207756],
       [-0.47328244,  1.        , -0.26335878],
       [ 0.1376702 , -0.20877458,  1.        ]])
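
The same check can be written so that it works for any number of columns and reports a single True/False. This is only a sketch: it swaps in np.polyfit for linregress (an assumption made here to keep the example NumPy-only, not what the answer used) and rebuilds the df from above:

```python
import itertools

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [5, 4, 7, 4, 6, 3, 2, 7, 4, 1],
    "B": [6, 4, 3, 3, 5, 8, 8, 2, 1, 6],
    "C": [9, 2, 5, 9, 3, 6, 1, 7, 5, 6],
})

# Vectorized: entry [x, y] = corr(x, y) * std(y) / std(x)
std = df.std().values
vectorized = df.corr() * (std / std[:, np.newaxis])

# Pairwise: fit y = b0 + b1 * x for every ordered pair (x, y) and keep b1.
# np.polyfit returns coefficients from highest degree down, so [0] is the slope.
n = len(df.columns)
pairwise = np.array([
    np.polyfit(df[x].to_numpy(), df[y].to_numpy(), 1)[0]
    for x, y in itertools.product(df.columns, repeat=2)
]).reshape(n, n)

matches = np.allclose(vectorized.values, pairwise)
print(matches)
```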
