R: cv.glm "variable lengths differ" error

I'm trying to compare backward selection against a full linear regression for dimensionality reduction. The dataset is fairly large, with 150 variables.

I have always used the same method to compare selected models with cross-validation, but with this dataset cv.glm gives an error that I can't fix:

Error in model.frame.default(formula = SurveyTest$H.test ~ : variable lengths differ (found for 'Music')

There are no NA values in SurveyTest, and I can't find any other cause for the length difference.

Code for cross-validation:

#Linear regression full model
lm_full <- lm(SurveyTest$H.test~.,data=SurveyTest)
summary(lm_full)

#Backward selection
library(MASS)  # provides stepAIC
lm_init <- lm(H.test~1,data=SurveyTest)
backward_lm <- stepAIC(lm_full,scope = formula(lm_init),direction="backward",
                       trace = FALSE)
summary(backward_lm)
AIC(backward_lm)

#Cross Validation
library(boot)
model1 <- glm(lm_full)
summary(lm_full)
model2 <- glm(backward_lm)
cv.glm(data=SurveyTest, glmfit=model1,K=10)
cv.glm(data=SurveyTest, glmfit=model2,K=10)

I think I found the solution. I should create lm_full with

lm_full <- lm(H.test~.,data=SurveyTest)

instead of

lm_full <- lm(SurveyTest$H.test~.,data=SurveyTest)

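That solved the problem. cv.glm refits the model on each training fold by passing a subset of the rows as data, but SurveyTest$H.test always refers to the full-length column in the original data frame, so the response and the predictors end up with different lengths. Writing the response as the bare column name H.test lets it be looked up in each fold's data.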
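The failure and the fix can be reproduced on a small synthetic data frame. The sketch below is only an illustration: the data frame `survey` and its columns `Music` and `Movies` are made-up stand-ins for SurveyTest, and the models are fitted directly with glm() rather than converted from lm objects.

```r
library(boot)  # provides cv.glm

set.seed(1)

# Hypothetical stand-in for SurveyTest (synthetic data, not from the post)
n <- 100
survey <- data.frame(
  H.test = rnorm(n),
  Music  = rnorm(n),
  Movies = rnorm(n)
)

# Response written as survey$H.test: this always evaluates to the full
# 100-row column, even when cv.glm refits on a fold with fewer rows.
bad_fit <- glm(survey$H.test ~ Music + Movies, data = survey)
bad_msg <- tryCatch(
  {
    cv.glm(data = survey, glmfit = bad_fit, K = 10)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(bad_msg)

# Response written as a bare column name: it is looked up in whatever
# data frame the model is refitted on, so every fold is consistent.
good_fit <- glm(H.test ~ ., data = survey)
cv_good <- cv.glm(data = survey, glmfit = good_fit, K = 10)
print(cv_good$delta)
```

In the value returned by cv.glm, `delta[1]` is the raw cross-validation estimate of prediction error and `delta[2]` is the bias-adjusted one, so comparing `delta[1]` between the full and the backward-selected model gives the comparison the question is after. Note also that with `SurveyTest$H.test ~ .` the dot expands to every column of the data, which likely pulls H.test itself in as a predictor, so the bare-name form is the safer one for the full model as well.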
