
Get test error in a logistic regression model in R

I'm running some experiments with logistic regression in R using the Auto dataset (from the ISLR package).

I have split the data into a training part (80%) and a test part (20%), normalizing each part individually.

I can create the model without any problem with this line:

mlr <- glm(mpg ~ displacement + horsepower + weight, data = train)

I can even predict train$mpg on the training set:

trainpred<-predict(mlr,train,type="response")

And with this I calculate the in-sample error:

etab <- table(trainpred, train[,1])
insampleerror<-sum(diag(etab))/sum(etab)

The problem comes when I want to predict on the test set. I use the following line:

testpred<-predict(model_rl,test,type="response")

This gives me the following warning:

'newdata' had 79 rows but variables found have 313 rows

So it doesn't work, because testpred has the same length as trainpred (it should be shorter). When I try to calculate the test error using testpred with the following line:

etabtest <- table(testpred, test[,1])

I get the following error:

Error in table(testpred, test[, 1]) :
all arguments must have the same length

What am I doing wrong?

I'm answering my own question in case someone has the same problem:

When I pass the arguments to glm, I am stating what I want to predict: the Auto$mpg labels, using the training data. Hence, my glm call must be:

attach(Auto)
mlr <- glm(mpg ~ displacement + horsepower + weight, data = Auto, subset = indexes_train)

If I now call predict, table, etc., there is no longer any problem with mismatched sizes. Fixing this mistake made it work for me.
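The size mismatch can be reproduced with a minimal sketch. It assumes the built-in mtcars data as a stand-in for Auto (so it runs without extra packages), with disp, hp and wt standing in for displacement, horsepower and weight; the names idx_train, fit_bad and fit_ok are hypothetical. One common cause of this warning, which may or may not be exactly what happened in the question, is a formula that refers to columns as train$..., so that predict cannot substitute newdata:

```r
# Stand-in data: mtcars replaces Auto here (an assumption for this sketch).
set.seed(1)
n <- nrow(mtcars)
idx_train <- sample(n, size = floor(0.8 * n))  # hypothetical 80% training index
train <- mtcars[idx_train, ]
test  <- mtcars[-idx_train, ]

# Formula hard-wired to train$...: predict() cannot swap in newdata,
# so it warns and returns one prediction per training row.
fit_bad  <- glm(train$mpg ~ train$disp + train$hp + train$wt)
pred_bad <- suppressWarnings(predict(fit_bad, newdata = test))
length(pred_bad) == nrow(train)

# Bare column names plus data/subset: predict() uses newdata as expected.
fit_ok  <- glm(mpg ~ disp + hp + wt, data = mtcars, subset = idx_train)
pred_ok <- predict(fit_ok, newdata = test, type = "response")
length(pred_ok) == nrow(test)

# For a continuous response such as mpg, one usual test error is the
# mean squared error rather than a table of predicted vs. actual values.
test_mse <- mean((test$mpg - pred_ok)^2)
```

With bare column names in the formula, the same fitted model can be evaluated on any data frame that has those columns.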

As imo says: "More importantly, you might check that this creates a logistic regression. I think it is actually OLS. You have to set the link and family arguments."

Set family = 'binomial'.
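Note that the binomial family needs a binary (or proportion) response, so a continuous mpg cannot be used directly. A minimal sketch, again assuming mtcars as a stand-in for Auto and a hypothetical binary label mpg_high (1 if mpg is above the median), might compute the test error like this:

```r
# Stand-in data: mtcars replaces Auto here (an assumption for this sketch).
set.seed(1)
dat <- mtcars
dat$mpg_high <- as.integer(dat$mpg > median(dat$mpg))  # hypothetical binary label
idx_train <- sample(nrow(dat), size = floor(0.8 * nrow(dat)))
test <- dat[-idx_train, ]

# Logistic regression: binomial family with its default logit link.
# On a sample this small, glm() may warn about fitted probabilities of
# 0 or 1; the sketch still runs.
fit <- glm(mpg_high ~ wt, data = dat, subset = idx_train, family = binomial)

# type = "response" returns probabilities; threshold at 0.5 to get classes.
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)

# Confusion table and misclassification rate on the test set.
etabtest   <- table(predicted = pred, actual = test$mpg_high)
test_error <- mean(pred != test$mpg_high)
```

Here test_error is the share of misclassified test rows; the 0.5 threshold is a conventional choice, not something taken from the original post.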
