Standardization in glmnet in R

Question

library(glmnet)

set.seed(123)
n = 100
p = 20
x =  matrix(rnorm(n * p, mean = 2, sd = 2), n, p)
y =  rnorm(n)
lambda = 0.05

fit1 =  glmnet(x,y, lambda = lambda)
beta1 = as.vector(coef(fit1, s = lambda, exact = TRUE))
beta1[which(abs(beta1) > 0)]

xsd = apply(x, 2, function(x) (x  - mean(x))/sqrt(var(x) * (n - 1) / n))
fit2 =  glmnet(xsd,y,lambda = lambda, standardize = FALSE)
beta2 = as.vector(coef(fit2, s = lambda, exact = TRUE))
beta2[which(abs(beta2) > 0)]

est.table = data.frame("beta1" = beta1[which(abs(beta1) > 0)], "beta2" = beta2[which(abs(beta2) > 0)])

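I expected glmnet to give the same output for these two lasso problems: one fitted on the raw data (standardize = TRUE), the other fitted on data I standardized myself (standardize = FALSE). Why are the outputs completely different?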

Answer 1

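When you set standardize = TRUE, the coefficients are returned on the original scale. That means you can use them directly with your input matrix to get predictions.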

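In the fit with standardize = FALSE, the input you pass to glmnet has already been divided by each column's standard deviation, so the coefficients it returns are on the standardized scale: each one is larger than its original-scale counterpart by a factor of that column's standard deviation.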
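To see where that factor comes from, here is a short derivation. The notation is introduced only for illustration and does not appear in the thread: \mu_j and \sigma_j are the mean and divide-by-n standard deviation of column j, z_{ij} is the standardized input, and \tilde\beta denotes the coefficients fitted on z.

```latex
z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}
\qquad\Longrightarrow\qquad
\hat y_i
  = \tilde\beta_0 + \sum_{j=1}^{p} \tilde\beta_j z_{ij}
  = \Bigl(\tilde\beta_0 - \sum_{j=1}^{p} \frac{\tilde\beta_j \mu_j}{\sigma_j}\Bigr)
    + \sum_{j=1}^{p} \frac{\tilde\beta_j}{\sigma_j}\, x_{ij}
```

Reading off the coefficient of x_{ij}, the original-scale slope is \beta_j = \tilde\beta_j / \sigma_j, and the term in parentheses is the original-scale intercept.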

To compare the two fits, divide the coefficients from the standardize = FALSE fit by the standard deviation of each column:

library(glmnet)

set.seed(123)
n = 100
p = 20
x =  matrix(rnorm(n * p, mean = 2, sd = 2), n, p)
y =  rnorm(n)
lambda = c(0.01,0.05,0.1,0.5,1)

fit1 =  glmnet(x,y, lambda = lambda,standardize = TRUE)
beta1 = as.matrix(fit1$beta)

xsd = apply(x, 2, function(x) (x  - mean(x))/sqrt(var(x) * (n - 1) / n))
fit2 =  glmnet(xsd,y,lambda = lambda, standardize = FALSE)
beta2 = as.matrix(fit2$beta)

Now we compute the sd of each column:

colsd = apply(x, 2, function(x)sqrt(var(x) * (n - 1) / n))

We divide the coefficients from the fit on standardized input by this sd:

head(sweep(beta2,1,colsd,"/"))
   s0 s1         s2           s3           s4
V1  0  0 0.00000000 -0.014049634 -0.032142780
V2  0  0 0.00000000 -0.001181405 -0.026486241
V3  0  0 0.01605406  0.051932402  0.082018905
V4  0  0 0.00000000  0.000000000  0.000000000
V5  0  0 0.00000000  0.000000000  0.004122524
V6  0  0 0.00000000  0.000000000  0.000000000

And compare with the other regression:

head(beta1)
   s0 s1         s2           s3           s4
V1  0  0 0.00000000 -0.014049634 -0.032142780
V2  0  0 0.00000000 -0.001181405 -0.026486241
V3  0  0 0.01605406  0.051932402  0.082018905
V4  0  0 0.00000000  0.000000000  0.000000000
V5  0  0 0.00000000  0.000000000  0.004122524
V6  0  0 0.00000000  0.000000000  0.000000000
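Rather than comparing the printed tables by eye, the agreement can also be checked numerically. The following is a minimal sketch that repeats the setup above. The names pop_sd, beta2_rescaled and a0_rescaled are introduced here for illustration only, and the intercept conversion follows from substituting the standardized columns back into the fitted equation.

```r
library(glmnet)

set.seed(123)
n <- 100
p <- 20
x <- matrix(rnorm(n * p, mean = 2, sd = 2), n, p)
y <- rnorm(n)
lambda <- c(0.01, 0.05, 0.1, 0.5, 1)

# Divide-by-n standard deviation, the same scaling used for xsd above
pop_sd <- function(v) sqrt(var(v) * (length(v) - 1) / length(v))
colsd <- apply(x, 2, pop_sd)

# Center and scale each column by hand
xsd <- sweep(sweep(x, 2, colMeans(x), "-"), 2, colsd, "/")

fit1 <- glmnet(x, y, lambda = lambda, standardize = TRUE)
fit2 <- glmnet(xsd, y, lambda = lambda, standardize = FALSE)

# Slopes: bring the standardized-scale coefficients back to the original scale
beta1 <- as.matrix(fit1$beta)
beta2_rescaled <- sweep(as.matrix(fit2$beta), 1, colsd, "/")

# Intercepts: subtract sum_j beta_j * mean_j / sd_j for each lambda
a0_rescaled <- fit2$a0 - colSums(as.matrix(fit2$beta) * (colMeans(x) / colsd))

# Largest absolute differences; both should be close to zero,
# up to the solver's convergence tolerance
max(abs(beta1 - beta2_rescaled))
max(abs(fit1$a0 - a0_rescaled))
```

Any small nonzero difference comes from the convergence threshold of the coordinate descent solver, not from a difference between the two models.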
