Python: covariance matrix of an array of vectors

I have an array of size-4 vectors (which we could consider 4-tuples). I want to find the covariance matrix, but if I call np.cov I get a huge matrix, whereas I'm expecting a 4x4. The code is simply print(np.cov(iris_separated[0])), where iris_separated[0] is the setosas from the iris dataset.

The output of print(iris_separated[0]) looks like this:

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [5.4 3.9 1.7 0.4]
 [4.6 3.4 1.4 0.3]
 [5.  3.4 1.5 0.2]
 [4.4 2.9 1.4 0.2]
 [4.9 3.1 1.5 0.1]
 [5.4 3.7 1.5 0.2]
 [4.8 3.4 1.6 0.2]
 [4.8 3.  1.4 0.1]
 [4.3 3.  1.1 0.1]
 [5.8 4.  1.2 0.2]
 [5.7 4.4 1.5 0.4]
 [5.4 3.9 1.3 0.4]
 [5.1 3.5 1.4 0.3]
 [5.7 3.8 1.7 0.3]
 [5.1 3.8 1.5 0.3]
 [5.4 3.4 1.7 0.2]
 [5.1 3.7 1.5 0.4]
 [4.6 3.6 1.  0.2]
 [5.1 3.3 1.7 0.5]
 [4.8 3.4 1.9 0.2]
 [5.  3.  1.6 0.2]
 [5.  3.4 1.6 0.4]
 [5.2 3.5 1.5 0.2]
 [5.2 3.4 1.4 0.2]
 [4.7 3.2 1.6 0.2]
 [4.8 3.1 1.6 0.2]
 [5.4 3.4 1.5 0.4]
 [5.2 4.1 1.5 0.1]
 [5.5 4.2 1.4 0.2]
 [4.9 3.1 1.5 0.2]
 [5.  3.2 1.2 0.2]
 [5.5 3.5 1.3 0.2]
 [4.9 3.6 1.4 0.1]
 [4.4 3.  1.3 0.2]
 [5.1 3.4 1.5 0.2]
 [5.  3.5 1.3 0.3]
 [4.5 2.3 1.3 0.3]
 [4.4 3.2 1.3 0.2]
 [5.  3.5 1.6 0.6]
 [5.1 3.8 1.9 0.4]
 [4.8 3.  1.4 0.3]
 [5.1 3.8 1.6 0.2]
 [4.6 3.2 1.4 0.2]
 [5.3 3.7 1.5 0.2]
 [5.  3.3 1.4 0.2]]

I'm expecting a 4x4 covariance matrix, but instead I'm getting a much larger matrix:

[[4.75       4.42166667 4.35333333 ... 4.23       4.945      4.60166667]
 [4.42166667 4.14916667 4.055      ... 3.93833333 4.59916667 4.29583333]
 [4.35333333 4.055      3.99       ... 3.87666667 4.53166667 4.21833333]
 ...
 [4.23       3.93833333 3.87666667 ... 3.77       4.405      4.09833333]
 [4.945      4.59916667 4.53166667 ... 4.405      5.14916667 4.78916667]
 [4.60166667 4.29583333 4.21833333 ... 4.09833333 4.78916667 4.4625    ]]

print(np.cov(iris_separated[0], rowvar=False)) fixes the problem; so does using .T on the data.
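
To illustrate the two fixes mentioned here, the following is a minimal sketch that uses only the first five rows of the printout above. The variable name data is an assumption made for this example; the original code uses iris_separated[0]. With the full 50-row array, the default call would return a 50x50 matrix instead of the 5x5 shown here.

```python
import numpy as np

# First five setosa rows copied from the printout above.
# Each row is one flower (an observation); each column is one feature (a variable).
# "data" is a stand-in name for iris_separated[0].
data = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
])

# Default (rowvar=True): every ROW is treated as a variable,
# so the result has one row/column per flower.
default_cov = np.cov(data)

# rowvar=False: every COLUMN is treated as a variable.
by_columns = np.cov(data, rowvar=False)

# Transposing first achieves the same thing with the default setting.
by_transpose = np.cov(data.T)

print(default_cov.shape)                      # (5, 5)
print(by_columns.shape)                       # (4, 4)
print(np.allclose(by_columns, by_transpose))  # True
```

Both rowvar=False and .T give the same 4x4 result; which one to use is mostly a matter of readability.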

You need to transpose the matrix. By default, np.cov expects each row to represent a variable and each column to represent an observation, which is the opposite of how your data is laid out. Therefore, it should be np.cov(iris_separated[0].T). Please refer to the docs:

https://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html
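
To see what the transposed call computes, here is a rough sketch that builds the sample covariance by hand and compares it with np.cov. It assumes the default normalization by n - 1, and again uses only the first five rows of the question's data under the illustrative name data.

```python
import numpy as np

# First five setosa rows from the question (rows = observations, columns = variables).
data = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
])

n = data.shape[0]                     # number of observations

# Subtract each column's mean so every variable is centered at zero.
centered = data - data.mean(axis=0)

# Sample covariance: (X_c^T X_c) / (n - 1), giving one row/column per variable.
manual_cov = centered.T @ centered / (n - 1)

print(manual_cov.shape)                                     # (4, 4)
print(np.allclose(manual_cov, np.cov(data.T)))              # True
print(np.allclose(manual_cov, np.cov(data, rowvar=False)))  # True
```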
