R: cv.glm "variable lengths differ" error
I'm trying to compare backward selection against a full linear regression for dimensionality reduction. The dataset is rather big, with 150 variables.
I have always used the same method to compare selected models with cross-validation, but with this dataset cv.glm gives an error that I have trouble fixing:
Error in model.frame.default(formula = SurveyTest$H.test ~ : variable lengths differ (found for 'Music')
There are no NA values in SurveyTest, and I can't find any other cause for the length difference.
Code for the cross-validation:
# Linear regression, full model
lm_full <- lm(SurveyTest$H.test ~ ., data = SurveyTest)
summary(lm_full)

# Backward selection (stepAIC comes from the MASS package)
library(MASS)
lm_init <- lm(H.test ~ 1, data = SurveyTest)
backward_lm <- stepAIC(lm_full, scope = formula(lm_init), direction = "backward",
                       trace = FALSE)
summary(backward_lm)
AIC(backward_lm)

# Cross-validation (cv.glm comes from the boot package)
library(boot)
model1 <- glm(lm_full)
summary(lm_full)
model2 <- glm(backward_lm)
cv.glm(data = SurveyTest, glmfit = model1, K = 10)
cv.glm(data = SurveyTest, glmfit = model2, K = 10)
I think I found the solution. I should create lm_full with
lm_full <- lm(H.test~.,data=SurveyTest)
instead of
lm_full <- lm(SurveyTest$H.test~.,data=SurveyTest)
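That solved the problem. With SurveyTest$H.test in the formula, the response always has the full length of SurveyTest, whereas cv.glm refits the model on subsets of the rows, so the predictors end up shorter than the response.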
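The difference between the two formulas can be reproduced on a small scale. The sketch below is only an illustration: it assumes the built-in mtcars data as a stand-in for SurveyTest, mpg as a stand-in for H.test, and an arbitrary K = 5. The names bad_fit, good_fit, bad_cv and good_cv are made up for this example.

```r
library(boot)  # provides cv.glm

set.seed(1)  # cv.glm assigns observations to folds at random

# Problematic form: the response is looked up as mtcars$mpg, i.e. in the
# full data frame, so it keeps all rows even when cv.glm refits the model
# on a subset of the rows.
bad_fit <- glm(mtcars$mpg ~ ., data = mtcars)
bad_cv <- tryCatch(
  cv.glm(data = mtcars, glmfit = bad_fit, K = 5),
  error = function(e) conditionMessage(e)  # keep the message, do not stop
)
print(bad_cv)  # an error message instead of a cv.glm result

# Fixed form: the response is a bare column name resolved inside `data`,
# so it is subset together with the predictors in every fold.
good_fit <- glm(mpg ~ ., data = mtcars)
good_cv <- cv.glm(data = mtcars, glmfit = good_fit, K = 5)
print(good_cv$delta)  # raw and adjusted cross-validated prediction error
```

The same reasoning should apply to any function that refits a model on part of the data: variables written as df$column inside a formula bypass the data argument, so using bare column names is the safer habit.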