Standardizing X different in Python Lasso and R glmnet?

I was trying to get the same result fitting lasso using Python's scikit-learn and R's glmnet. A helpful link

If I specify normalize=True in Python and standardize = T in R, they give me the same result.

Python:

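import numpy as np
from sklearn.linear_model import Lasso

X = np.array([[1, 1, 2], [3, 4, 2], [6, 5, 2], [5, 5, 3]])
y = np.array([1, 0, 0, 1])
# Note: the normalize argument was deprecated in scikit-learn 1.0 and removed in 1.2
reg = Lasso(alpha=0.01, fit_intercept=True, normalize=True)
reg.fit(X, y)
# Intercept first, then the three coefficients
np.hstack((reg.intercept_, reg.coef_))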

Out[95]: array([-0.89607695,  0.        , -0.24743375,  1.03286824])

R:

library(glmnet)
# X and y hold the same data as in the Python example
reg_glmnet = glmnet(X, y, alpha = 1, lambda = 0.02, standardize = T)
coef(reg_glmnet)

4 x 1 sparse Matrix of class "dgCMatrix"
                    s0
(Intercept) -0.8960770
V1           .        
V2          -0.2474338
V3           1.0328682

However, if I don't want to standardize the variables and set normalize=False and standardize = F, they give me quite different results.

Python:

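import numpy as np
from sklearn.linear_model import Lasso

Z = np.array([[1, 1, 2], [3, 4, 2], [6, 5, 2], [5, 5, 3]])
y = np.array([1, 0, 0, 1])
# Note: the normalize argument was deprecated in scikit-learn 1.0 and removed in 1.2
reg = Lasso(alpha=0.01, fit_intercept=True, normalize=False)
reg.fit(Z, y)
# Intercept first, then the three coefficients
np.hstack((reg.intercept_, reg.coef_))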

Out[96]: array([-0.88      ,  0.09384212, -0.36159299,  1.05958478])

R:

library(glmnet)
# X and y hold the same data as in the Python example
reg_glmnet = glmnet(X, y, alpha = 1, lambda = 0.02, standardize = F)
coef(reg_glmnet)

4 x 1 sparse Matrix of class "dgCMatrix"
                     s0
(Intercept) -0.76000000
V1           0.04441697
V2          -0.29415542
V3           0.97623074

What's the difference between "normalize" in Python's Lasso and "standardize" in R's glmnet?

Currently, with regard to the normalize parameter, the docs state: "If you wish to standardize, please use StandardScaler before calling fit on an estimator with normalize=False."

So evidently normalize and standardize are not the same with sklearn.linear_model.Lasso . Having read the StandardScaler docs I fail to understand the difference, but the fact that there is one is implied by the provided description of the normalize parameter.
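
To make that implied difference concrete, here is a minimal NumPy sketch. It assumes, going by the description of the normalize parameter in older scikit-learn docs, that normalize=True subtracts each column's mean and divides by the column's l2 norm, while StandardScaler divides by the population standard deviation (ddof=0). The variable names are mine and purely illustrative; this does not call scikit-learn's internals.

```python
import numpy as np

# Same design matrix as in the question
X = np.array([[1, 1, 2], [3, 4, 2], [6, 5, 2], [5, 5, 3]], dtype=float)
n = X.shape[0]

# Both scalings start by centering each column
X_centered = X - X.mean(axis=0)

# Assumed behavior of normalize=True: divide by the l2 norm of the centered column
X_l2 = X_centered / np.linalg.norm(X_centered, axis=0)

# Behavior of StandardScaler: divide by the population standard deviation (ddof=0)
X_std = X_centered / X.std(axis=0)

# The l2 norm of a centered column equals sqrt(n) times its population std,
# so the two scalings differ by a constant factor of sqrt(n)
same_up_to_sqrt_n = np.allclose(X_std, np.sqrt(n) * X_l2)
print(same_up_to_sqrt_n)
```

If that reading is right, the two scalings are not interchangeable: the columns differ by a factor of sqrt(n), so a penalty value tuned under one scaling would not carry over unchanged to the other.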
