
Why is there a difference in Linear regression fitted values and predicted values on training data?

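library(MASS)   # provides the Boston housing data
data(Boston)
head(Boston)
# Random 80/20 train/test split
index <- sample(nrow(Boston), nrow(Boston) * 0.80)
train <- Boston[index, ]
test <- Boston[-index, ]
# Fit on the training set, then predict on the same data
model_1 <- lm(medv ~ ., data = train)
model_1train_p <- predict(model_1)
# Mean difference between the stored fitted values and the predictions
mean(model_1$fitted.values - model_1train_p)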

The code above reproduces the issue. I want to know why there is a non-zero difference.

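The difference comes from floating-point arithmetic. Computers store every number in binary, and most decimal fractions have no exact binary representation, so each calculation carries a tiny rounding error. When the same quantity is reached by two slightly different computations (here, the fitted values stored when the model was fit and the values recomputed by predict), the results can disagree in the last few bits. The difference you get is very, very small.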
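As a minimal illustration of that rounding behaviour, independent of lm, the sketch below adds two decimal fractions that cannot be stored exactly in binary. The variable name x is only for illustration.

```r
# 0.1, 0.2 and 0.3 have no exact binary representation
x <- 0.1 + 0.2

# Exact comparison fails because of rounding in the last bits
print(x == 0.3)                # FALSE

# The actual gap is tiny, far below anything that matters for a regression
print(x - 0.3)

# Printing extra digits shows the value that is really stored for 0.1
print(sprintf("%.20f", 0.1))
```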

To compare floating-point numbers for equality, use all.equal, which allows for a small numerical tolerance:

all.equal(model_1$fitted.values, model_1train_p)

Returns:

[1] TRUE
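A short sketch of how the comparison could be made more informative, assuming the built-in cars data set in place of Boston so that it runs without extra packages (fit, fitted_vals and predicted are illustrative names). It reports the largest absolute discrepancy, since a mean of signed differences can hide errors that cancel, and wraps all.equal in isTRUE, because all.equal returns a character description rather than FALSE when the values differ.

```r
# Fit a simple model on a data set that ships with base R
fit <- lm(dist ~ speed, data = cars)

fitted_vals <- fit$fitted.values   # stored when the model was fit
predicted   <- predict(fit)        # recomputed from the coefficients

# Largest absolute discrepancy between the two vectors
print(max(abs(fitted_vals - predicted)))

# Bit-for-bit comparison; may be FALSE even when the values agree numerically
print(identical(fitted_vals, predicted))

# Tolerance-based comparison, safe to use inside if()
print(isTRUE(all.equal(fitted_vals, predicted)))
```

By default all.equal compares numeric values with a tolerance of about 1.5e-8, which is far larger than the rounding differences discussed here.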

