I have two questions about PCA with scikit-learn.
Let's suppose I have the following data:
fullmatrix = [[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0],
              [2.3, 2.7],
              [2.0, 1.6],
              [1.0, 1.1],
              [1.5, 1.6],
              [1.1, 0.9]]
Now I do the PCA calculations:
from sklearn.decomposition import PCA

sklearn_pca = PCA()
Y_sklearn = sklearn_pca.fit_transform(fullmatrix)
print(Y_sklearn)  # the data transformed with all 2 eigenvectors
print(sklearn_pca.explained_variance_ratio_)  # variance explained by each eigenvector
print(sklearn_pca.components_)  # eigenvectors, ordered by decreasing eigenvalue
First question: How can I project Y_sklearn back onto the original scale? (I know we should get back the same data as fullmatrix, since I'm using all the eigenvectors; it's just to check that it was done right.)
Second question: How can I set a threshold for the minimum acceptable total variance based on sklearn_pca.explained_variance_ratio_? For example, say I want to keep adding eigenvectors until the cumulative explained_variance_ratio_ exceeds 95%. In this case it's easy: we just use the first eigenvector, since it alone explains about 0.963 (96.3%) of the variance. But how can we do this in a more automated way?
First: sklearn_pca.inverse_transform(Y_sklearn)
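As a sanity check, the round trip can be verified numerically. A minimal sketch using the data matrix from the question:

```python
import numpy as np
from sklearn.decomposition import PCA

fullmatrix = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]

sklearn_pca = PCA()  # keep all components, so no information is lost
Y_sklearn = sklearn_pca.fit_transform(fullmatrix)

# Projecting back should reproduce the original data (up to float rounding)
reconstructed = sklearn_pca.inverse_transform(Y_sklearn)
print(np.allclose(reconstructed, fullmatrix))  # True
```

With fewer than all components, `inverse_transform` still maps back to the original scale, but the result is only an approximation of the input.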
Second:
import numpy as np

thr = 0.95
# Where does the cumulative sum of explained variance reach the threshold?
is_exceeds = np.cumsum(sklearn_pca.explained_variance_ratio_) >= thr
# The minimal index that reaches the threshold; add 1 to get the
# minimum number of eigenvectors needed to keep this much variance
k = np.min(np.where(is_exceeds)) + 1
# Or you can simply initialize the model with thr as the n_components parameter:
# a float in (0, 1) tells PCA to keep just enough components for that variance
sklearn_pca = PCA(n_components=thr)
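Putting both variants together on the data from the question (a quick sketch; variable names as above):

```python
import numpy as np
from sklearn.decomposition import PCA

fullmatrix = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]
thr = 0.95

# Manual approach: count components until cumulative variance reaches thr
full = PCA().fit(fullmatrix)
k = np.min(np.where(np.cumsum(full.explained_variance_ratio_) >= thr)) + 1
print(k)  # 1, since the first eigenvector already explains ~96.3%

# Built-in approach: a float n_components keeps just enough components
reduced = PCA(n_components=thr).fit(fullmatrix)
print(reduced.n_components_)  # also 1
```

The built-in variant is preferable when you want the reduced transform directly; the manual variant is useful when you need `k` itself, e.g. to report it or reuse it elsewhere.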