
R: Elastic net prediction error calculation using cv.glm

library(glmnet)
library(boot)
data(iris)
x <- model.matrix(Sepal.Length~., iris)[,-1]
y <- iris$Sepal.Length
m <- cv.glmnet(x, y)
> cv.glm(x, m, K = 10)
Error in UseMethod("predict") : 
  no applicable method for 'predict' applied to an object of class "c('matrix', 'double', 'numeric')"

or

bestLambda = m$lambda.min
m2 <- glmnet(x, y, family = "gaussian", lambda = bestLambda)
> cv.glm(x, m2, K = 10)
 Error in glmnet(x = x, y = y, family = "gaussian", lambda = bestLambda,  : 
  unused argument (data = c(3.5, 3, 3.2, 3.1, 3.6,

In reference to this question, I'm trying to obtain the K-fold cross-validated prediction error of my elastic net model using cv.glm. However, both attempts above fail with the errors shown. I'm not sure whether the cv.glm function can be used to calculate the prediction error of an object of class cv.glmnet or glmnet.

I think you are mixing up glm with glmnet (elastic net, with lasso and ridge penalties). cv.glm expects a glm model, not a glmnet model.
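A quick way to see the mismatch is to inspect the classes of the fitted objects. This is only an illustrative sketch: the names m_net and m_glm are made up for the example and do not appear in the question.

```r
library(glmnet)

data(iris)
x <- model.matrix(Sepal.Length ~ ., iris)[, -1]
y <- iris$Sepal.Length

# m_net and m_glm are hypothetical names used only for this illustration
m_net <- cv.glmnet(x, y)                                        # cross-validated glmnet fit
m_glm <- glm(Sepal.Length ~ ., data = iris, family = gaussian)  # ordinary glm fit

class(m_net)            # the class of a cv.glmnet fit; it does not include "glm"
inherits(m_net, "glm")  # FALSE: not the kind of object cv.glm is written for
inherits(m_glm, "glm")  # TRUE: this is what cv.glm expects as its glmfit argument
```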

Try either of the following:

  1. Use glmnet to compute the k-fold cross-validation errors with cv.glmnet, like the following:

     library(glmnet)
     library(boot)
     data(iris)
     x <- model.matrix(Sepal.Length~., iris)[,-1]
     y <- iris$Sepal.Length
     m <- cv.glmnet(x, y, nfolds=10)
     m$lambda.min
     #[1] 0.0003839539
     m$lambda.1se
     #[1] 0.009078549
     plot(m$lambda, m$cvm,type='l', xlab=expression(lambda), ylab='CV errors', main=expression(paste('CV error for different ', lambda)))
     lines(m$lambda, m$cvup, col='red')
     lines(m$lambda, m$cvlo, col='red')


[EDITED]

Prediction error on the training dataset (predict on a cv.glmnet object uses s = "lambda.1se" by default):

mean((y-predict(m, newx=x))^2)
# [1] 0.1037433
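The figure above is a training error, not a cross-validated one. The K-fold estimate that the question asks about is already stored in the cv.glmnet object: the cvm component holds the mean cross-validated error for each value in lambda, and cvsd holds its estimated standard error. A minimal sketch of pulling out the values at lambda.min and lambda.1se follows; the seed and the variable names are arbitrary choices for the example.

```r
library(glmnet)

data(iris)
x <- model.matrix(Sepal.Length ~ ., iris)[, -1]
y <- iris$Sepal.Length

set.seed(1)                        # fold assignment is random; the seed is arbitrary
m <- cv.glmnet(x, y, nfolds = 10)  # gaussian family, so cvm is mean squared error

# Positions of the two reference lambdas on the fitted path
i_min <- which(m$lambda == m$lambda.min)
i_1se <- which(m$lambda == m$lambda.1se)

cv_err_min <- m$cvm[i_min]   # 10-fold CV error at lambda.min
cv_err_1se <- m$cvm[i_1se]   # 10-fold CV error at lambda.1se
cv_se_min  <- m$cvsd[i_min]  # standard error of the CV estimate at lambda.min

c(lambda.min = cv_err_min, lambda.1se = cv_err_1se)
```

Because lambda.min is defined as the lambda with the smallest mean CV error, cv_err_min is the same as min(m$cvm).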
  2. Fit a glm model and use cv.glm to compute the cross-validation error delta (without regularization). As per the documentation of cv.glm:

delta: A vector of length two. The first component is the raw cross-validation estimate of prediction error. The second component is the adjusted cross-validation estimate. The adjustment is designed to compensate for the bias introduced by not using leave-one-out cross-validation.

df <- cbind.data.frame(x, y)
m <- glm(y~., df, family='gaussian')
cv.glm(df, m, K = 10)$delta 
# [1] 0.09992177 0.09940190
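If a cv.glm-style estimate is wanted for a glmnet fit at one fixed lambda (as in the second attempt in the question), the folds can be looped over by hand. The following is a rough sketch under stated assumptions, not what cv.glm or cv.glmnet do internally: lambda_fixed is a hypothetical value picked for illustration, and the seed and fold assignment are arbitrary.

```r
library(glmnet)

data(iris)
x <- model.matrix(Sepal.Length ~ ., iris)[, -1]
y <- iris$Sepal.Length

set.seed(1)            # arbitrary seed so the fold assignment is reproducible
K <- 10
lambda_fixed <- 0.01   # hypothetical penalty value, for illustration only

# Randomly assign each observation to one of K folds of near-equal size
folds <- sample(rep(seq_len(K), length.out = nrow(x)))

fold_mse <- numeric(K)
for (k in seq_len(K)) {
  test <- folds == k
  # Fit on the K-1 training folds at the fixed lambda
  fit <- glmnet(x[!test, ], y[!test], family = "gaussian", lambda = lambda_fixed)
  # Predict on the held-out fold and record its mean squared error
  pred <- predict(fit, newx = x[test, , drop = FALSE])
  fold_mse[k] <- mean((y[test] - pred)^2)
}

cv_mse <- mean(fold_mse)  # average of the per-fold errors
cv_mse
```

Note that if the lambda itself was chosen by cross-validation on the same data, re-estimating the error at that lambda this way tends to be optimistic; the glmnet documentation also recommends fitting a whole lambda path rather than a single value.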


 