
sklearn PCA not working

I have been playing around with sklearn PCA and it is behaving oddly.

from sklearn.decomposition import PCA
import numpy as np
identity = np.identity(10)
pca = PCA(n_components=10)
augmented_identity = pca.fit_transform(identity)
np.linalg.norm(identity - augmented_identity)

4.5997749080745738

Note that I set the number of components to 10. Shouldn't the norm be 0?

Any insight into why it is not would be appreciated.

Although PCA computes the orthogonal components from the covariance matrix, the input to PCA in sklearn is the data matrix, not a covariance or correlation matrix.

import numpy as np
from sklearn.decomposition import PCA

# 100,000 samples of a 10-dimensional Gaussian with identity covariance
X = np.random.randn(100000, 10)

pca = PCA(n_components=10)
X_transformed = pca.fit_transform(X)

np.linalg.norm(np.cov(X.T) - np.cov(X_transformed.T))

Out[219]: 0.044691263454134933
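
To tie this back to the identity example in the question: fit_transform centers the data and expresses it in the principal-component basis, so its output is not expected to equal the input even when all 10 components are kept. The sketch below checks that by hand and then uses inverse_transform to map back to the original space. The variable names (transformed, centered, manual, reconstructed) are illustrative and not from the original post.

```python
import numpy as np
from sklearn.decomposition import PCA

identity = np.identity(10)
pca = PCA(n_components=10)
transformed = pca.fit_transform(identity)

# fit_transform subtracts the per-feature mean and rotates the data onto
# the principal axes, so the result lives in a different basis.
centered = identity - identity.mean(axis=0)
manual = centered @ pca.components_.T

# inverse_transform undoes the rotation and adds the mean back.
reconstructed = pca.inverse_transform(transformed)

print(np.allclose(transformed, manual))          # same as centering + rotation
print(np.allclose(reconstructed, identity))      # round trip recovers the input
print(np.linalg.norm(identity - reconstructed))  # numerically close to zero
```

With all components kept, the round trip through inverse_transform is where a norm of (numerically) zero should be expected, not the comparison against the transformed data.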
