Sklearn PCA: is pca.components_ the loadings?

Original post 2016-04-03 00:16:02 3 2 python/ scikit-learn/ pca

In sklearn's PCA, is pca.components_ the loadings? I am pretty sure it is, but I am trying to follow along with a research paper and my results differ from their loadings. I can't find this stated anywhere in the sklearn documentation.

2 answers

pca.components_ is the orthogonal basis of the space you're projecting the data into. It has shape (n_components, n_features). If you want to keep only the first 3 components (for instance, to do a 3D scatter plot) of a dataset with 100 samples and 50 dimensions (also called features), pca.components_ will have shape (3, 50).
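As a quick check of the shape claim, here is a minimal sketch. The random data and the name X_train are stand-ins chosen for illustration, not anything from the question; only the 100 x 50 size and n_components=3 come from the example above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data: 100 samples, 50 features (an assumption for illustration).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 50))

pca = PCA(n_components=3)
pca.fit(X_train)

# One row per component, one column per original feature.
print(pca.components_.shape)  # (3, 50)

# The rows form an orthonormal basis: their Gram matrix is the identity.
gram = pca.components_ @ pca.components_.T
print(np.allclose(gram, np.eye(3)))  # True
```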

I think what you call the "loadings" is the result of projecting each sample into the vector space spanned by the components. Those can be obtained by calling pca.transform(X_train) after calling pca.fit(X_train). The result will have shape (n_samples, n_components), that is, (100, 3) for our previous example.
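The projection step can be sketched as follows, again on synthetic stand-in data (X_train and the variable names scores and manual are illustrative assumptions). With the default whiten=False, the output of transform should match centering the data and multiplying by the transposed components.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data: 100 samples, 50 features (an assumption for illustration).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 50))

pca = PCA(n_components=3)
pca.fit(X_train)

# Project every sample onto the 3 retained components.
scores = pca.transform(X_train)
print(scores.shape)  # (100, 3)

# Same projection done by hand: center, then multiply by the basis vectors.
manual = (X_train - pca.mean_) @ pca.components_.T
print(np.allclose(scores, manual))  # True
```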

The previous answer is mostly correct, except about the loadings. components_ is in fact the loadings, as the question asker originally stated. The result of the fit_transform function will give you the principal components (the transformed/reduced matrix).
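One possible source of a mismatch with a paper is terminology: some texts use "loadings" for the unit-length eigenvectors (what components_ stores), while others scale each eigenvector by the square root of its eigenvalue. The sketch below computes both on synthetic stand-in data. Which convention the paper uses is not known here, so treat the scaled version as an assumption to check, and note that the sign of each component is arbitrary in either convention. The names unit_loadings and scaled_loadings are illustrative, not sklearn attributes.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data: 100 samples, 50 features (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))

pca = PCA(n_components=3)
scores = pca.fit_transform(X)  # transformed/reduced matrix, shape (100, 3)

# Convention 1: unit-length eigenvectors, one column per component.
unit_loadings = pca.components_.T  # shape (50, 3)

# Convention 2 (used by some texts): eigenvectors scaled by sqrt(eigenvalue).
scaled_loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

# Columns of convention 1 have norm 1; columns of convention 2 have
# squared norm equal to the variance explained by that component.
print(np.allclose(np.linalg.norm(unit_loadings, axis=0), 1.0))
print(np.allclose(np.linalg.norm(scaled_loadings, axis=0) ** 2,
                  pca.explained_variance_))
```

If the paper's values differ from components_ by a constant factor per component, the scaled convention is worth trying first.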