
Should I normalize the dependent variable in a penalized linear regression model?

When I fit penalized regressions (lasso, ridge, and elastic net) with the glmnet package in R without normalizing the data, the lambda values and RMSE come out unreasonably large; the RMSE is in the thousands. However, when I normalize the response variable, the lambda, RMSE, and R^2 values all fall within a reasonable range, < 1. Are we supposed to normalize the response variable? I tried scaling the predictor variables instead, but that still produces large values for lambda, RMSE, and R^2. The response variable is numeric: the number of shares of online articles, with values ranging from 1 to 840,000 and a mean of 3500.

RMSE is a measure of the prediction error in the model. It is on the same scale as your dependent variable and can only be interpreted in relation to that scale. You can get a lower RMSE by rescaling your dependent variable, but with a linear rescaling the number shrinks only because the units changed, not because the predictions improved: dividing the response by 1,000, for example, divides the RMSE by 1,000 as well. An RMSE in the thousands does not seem unreasonable if your values are in the range 1-840,000 with a mean of 3500.
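To make the scale argument concrete, here is a minimal sketch in plain Python. It is not the asker's glmnet workflow: the data are simulated, `fit_ridge_1d` is a hypothetical closed-form ridge fit with one predictor and an unpenalized intercept, and `lam = 5.0` is an arbitrary illustrative penalty. Under those assumptions, standardizing the response divides the RMSE by the response's standard deviation and leaves R^2 unchanged.

```python
import math
import random
import statistics


def fit_ridge_1d(x, y, lam):
    """Toy ridge fit with one predictor and an unpenalized intercept.

    Minimizes sum((y - a - b*x)^2) + lam * b^2, whose solution is
    b = Sxy / (Sxx + lam) and a = mean(y) - b * mean(x).
    """
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / (sxx + lam)
    return my - slope * mx, slope


def predict(x, intercept, slope):
    return [intercept + slope * xi for xi in x]


def rmse(y, y_hat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r_squared(y, y_hat):
    my = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Simulated data (an assumption for illustration, not the article-shares data).
random.seed(0)
x = [random.uniform(0, 10) for _ in range(200)]
y = [2000 + 300 * xi + random.gauss(0, 1500) for xi in x]

lam = 5.0  # arbitrary penalty, used for both fits

# Fit on the raw response.
a_raw, b_raw = fit_ridge_1d(x, y, lam)
rmse_raw = rmse(y, predict(x, a_raw, b_raw))
r2_raw = r_squared(y, predict(x, a_raw, b_raw))

# Fit on the standardized response (mean 0, standard deviation 1).
mean_y = statistics.mean(y)
sd_y = statistics.stdev(y)
y_std = [(v - mean_y) / sd_y for v in y]
a_std, b_std = fit_ridge_1d(x, y_std, lam)
rmse_std = rmse(y_std, predict(x, a_std, b_std))
r2_std = r_squared(y_std, predict(x, a_std, b_std))

print("RMSE raw:", rmse_raw, "RMSE standardized:", rmse_std)
print("RMSE raw / sd(y):", rmse_raw / sd_y)
print("R^2 raw:", r2_raw, "R^2 standardized:", r2_std)
```

The two RMSE values describe the same fit in different units, so the smaller one is not evidence of a better model. To compare against the original data, multiply the standardized RMSE by the standard deviation of the response.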
