Extracting PCA components with sklearn

Question

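I am using sklearn's PCA to reduce the dimensionality of a large set of images. Once the PCA is fitted, I would like to see what the components look like.

This can be done by looking at the components_ attribute. Not realizing that it was available, I did something else instead: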

each_component = np.eye(total_components)
component_im_array = pca.inverse_transform(each_component)

for i in range(num_components):
   component_im = component_im_array[i, :].reshape(height, width)
   # do something with component_im

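In other words, I create an image in PCA space that has every feature but one set to 0. By inverse transforming these, I should get, in the original space, the image that would be expressed by that single PCA component alone once transformed.

The image below shows the results. On the left is the component computed with my method. On the right is pca.components_[i]. Also, with my method most of the images are very similar (though not identical), whereas the images obtained from components_ are very different from one another, as I expected.

Is there a conceptual problem with my method? Clearly the components in pca.components_[i] are correct (or at least more correct) than the ones I get. Thanks!

Left: computed component. Right: real component.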

Answer 1

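The components and the inverse transform are two different things. The inverse transform maps coordinates in the PCA space (the scores) back to the original image space: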

from sklearn.decomposition import PCA

# Create a PCA model with two principal components
# (data is assumed to be an array of shape (n_samples, n_features))
pca = PCA(n_components=2)
pca.fit(data)
# Project the original data onto the components to get the scores
scores = pca.transform(data)
# Reconstruct from the 2-dimensional scores
reconstruct = pca.inverse_transform(scores)
# The residual is the part not explained by the first two components
residual = data - reconstruct

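So inverse_transform is meant to be applied to (transformed) data, not to the components, which is why the two results are completely different. You almost never inverse transform the original, untransformed data. pca.components_ holds the actual vectors representing the basis axes used to project the data into the PCA space.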
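A minimal sketch of the point about pca.components_, assuming a small random array as a stand-in for the flattened images (the data, shapes, and variable names are illustrative and not from the question). With the default whiten=False, transform amounts to projecting the mean-centered data onto the rows of components_, and those rows are orthonormal:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for flattened images: 50 samples, 12 "pixels" each.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 12))

pca = PCA(n_components=2)  # default whiten=False is assumed here
pca.fit(data)
scores = pca.transform(data)

# Projecting the centered data onto the component vectors reproduces transform().
manual_scores = (data - pca.mean_) @ pca.components_.T
projection_matches = np.allclose(scores, manual_scores)

# The rows of components_ form an orthonormal basis of the PCA subspace.
gram = pca.components_ @ pca.components_.T
rows_orthonormal = np.allclose(gram, np.eye(2))

print(projection_matches, rows_orthonormal)
```

Each row of components_ has one entry per original feature, so for image data it can be reshaped to (height, width) and viewed directly, with no inverse transform needed.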

Answer 2

The difference between grabbing components_ and calling inverse_transform on the identity matrix is that the latter adds the empirical mean of each feature. That is:

def inverse_transform(self, X):
    return np.dot(X, self.components_) + self.mean_

where self.mean_ was estimated from the training set.
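This can be checked numerically. The sketch below repeats the identity-matrix approach from the question on made-up data (the random array, the 4x5 image size, and the offset of 10 are assumptions for illustration) and compares the result with components_ plus mean_, again assuming the default whiten=False:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 40 flattened 4x5 "images" sharing a large common offset.
rng = np.random.default_rng(0)
height, width = 4, 5
data = rng.normal(size=(40, height * width)) + 10.0

pca = PCA(n_components=3)
pca.fit(data)

# The approach from the question: inverse transform one-hot vectors in PCA space.
each_component = np.eye(pca.n_components_)
component_im_array = pca.inverse_transform(each_component)

# Each row equals the matching component plus the per-feature training mean.
same_up_to_mean = np.allclose(component_im_array, pca.components_ + pca.mean_)

# Subtracting the mean recovers components_ exactly.
recovered = component_im_array - pca.mean_
matches_components = np.allclose(recovered, pca.components_)

print(same_up_to_mean, matches_components)
```

Since the same mean image is added to every row, it is plausible that the images computed in the question look alike because the mean dominates the unit-length component vectors, although that cannot be confirmed without the original data.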

Answer 1: score 5, 2014-03-02 11:31:18

Answer 2: score 4, accepted, 2014-03-03 08:23:15