sklearn PLSRegression - Variance of X explained by latent vectors

I performed a partial least squares regression using Python's sklearn.cross_decomposition.PLSRegression.

Is there a way to retrieve the fraction of explained variance of X, i.e. R²(X), for each PLS component? I'm looking for something similar to the explvar() function from the R pls package, but I'd also appreciate any suggestions on how to compute it myself.

There is a similar question with an answer that explains how to get the explained variance of Y. I assume "variance in Y" was what was asked for there, which is why I opened a new question. I hope that's OK.

I managed to find a solution for the problem. The following gives the fraction of variance in X explained by each latent vector after PLS regression:

import numpy as np
from sklearn import cross_decomposition

# X is a numpy ndarray with samples in rows and predictor variables in columns
# y is a one-dimensional ndarray containing the response variable

# total variance in X, summed over all predictor columns (a scalar)
total_variance_in_x = np.sum(np.var(X, axis=0))

# scale=False keeps the scores in the units of X; with the default
# scale=True, sklearn standardizes X internally, so the total variance
# would have to be computed on the standardized X instead
pls1 = cross_decomposition.PLSRegression(n_components=5, scale=False)
pls1.fit(X, y)

# variance of the scores (the transformed X data) for each latent vector:
variance_in_x = np.var(pls1.x_scores_, axis=0)

# normalize by the total variance to get per-component fractions:
fractions_of_explained_variance = variance_in_x / total_variance_in_x
