Why is there a difference in Linear regression fitted values and predicted values on training data?

Question

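library(MASS)   # provides the Boston housing data
data(Boston)
head(Boston)
# Random 80/20 split into training and test rows
index <- sample(nrow(Boston), nrow(Boston) * .80)
train <- Boston[index, ]
test <- Boston[-index, ]
# Fit on the training set, then predict on that same training set
model_1 <- lm(medv ~ ., data = train)
model_1train_p <- predict(model_1)
# Mean difference between stored fitted values and predictions
mean(model_1$fitted.values - model_1train_p)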

The code above reproduces the issue. I would like to know why there is a non-zero difference.

Answer 1

The difference arises because computers can't represent most decimal values exactly. Every number is stored internally in binary with a fixed number of bits, and it's not always possible to get an exact binary representation of a decimal. Each arithmetic step therefore rounds slightly, so two computations of the same quantity can disagree in the last few bits. The difference you get is very, very small.
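As a minimal illustration of the rounding issue (a generic sketch, not taken from the question's data; the variable name x is just for the example), the classic case is that 0.1 + 0.2 is not stored as exactly 0.3:

```r
# 0.1, 0.2 and 0.3 have no exact binary representation,
# so the sum is off from 0.3 by a tiny rounding error.
x <- 0.1 + 0.2

exact_match <- (x == 0.3)                  # strict comparison
close_match <- isTRUE(all.equal(x, 0.3))   # comparison within a tolerance

print(exact_match)   # FALSE
print(close_match)   # TRUE
```

The strict == test fails even though the two values agree to about 16 significant digits, which is the same kind of discrepancy the question is seeing.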

If you want to check for equality of decimal numbers, use all.equal, which compares values within a small tolerance instead of bit for bit:

all.equal(model_1$fitted.values, model_1train_p)

Returns:

[1] TRUE
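To see how small the discrepancy actually is, one option is to look at the largest absolute difference rather than the mean. This is a sketch that refits the question's model; the seed, the floor() around the sample size, and the 1e-8 threshold are assumptions added here for reproducibility and illustration, not part of the original code:

```r
library(MASS)   # provides the Boston data set

set.seed(1)     # assumed seed, only so the split is reproducible
index <- sample(nrow(Boston), floor(nrow(Boston) * 0.80))
train <- Boston[index, ]

model_1 <- lm(medv ~ ., data = train)
model_1train_p <- predict(model_1)

# Largest absolute gap between stored fitted values and predictions
max_abs_diff <- max(abs(model_1$fitted.values - model_1train_p))
print(max_abs_diff)

# The gap is far below any practically meaningful threshold
print(max_abs_diff < 1e-8)
print(isTRUE(all.equal(model_1$fitted.values, model_1train_p)))
```

Wrapping all.equal in isTRUE is the usual way to use it inside an if condition, since all.equal returns a description of the difference rather than FALSE when the values do not match.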

solution1
2 2020-02-23 10:07:38
