Predict function for lm object in R

Why are prediction_me and prediction_R not equal? I'm attempting to follow the formula given by Lemma 5 here. Does the predict function use a different formula, have I made a mistake somewhere in my computation, or is it just rounding error? (The two are pretty close.)

set.seed(100)
# generate data
x    <- rnorm(100, 10)
y    <- 3 + x + rnorm(100, 5)
data <- data.frame(x = x, y = y)
# fit model
mod  <- lm(y ~ x, data = data)

# new observation
data2 <- data.frame(x = rnorm(5, 10))

# prediction for new observation
d    <- as.matrix(cbind(1, data[,-2]))
d2   <- as.matrix(cbind(1, data2))
fit  <- d2 %*% mod$coefficients 
t    <- qt(1 - .025, mod$df.residual)
s    <- summary(mod)$sigma
half <- as.vector(t*s*sqrt(1 + d2%*%solve(t(d)%*%d, t(d2))))

prediction_me <- cbind(fit, fit - half, fit + half)

prediction_R <- predict(mod, newdata = data2, interval = 'prediction')


prediction_me
prediction_R

Your current code is almost fine. Just note that the formula in Lemma 5 is for a single newly observed x. With five new observations, d2 %*% solve(t(d) %*% d, t(d2)) is a 5 x 5 matrix, so half contains not only the relevant variances (the diagonal entries) but also the covariances (the off-diagonal entries), while you only need the former. Thus, as.vector should be replaced with diag:

half <- diag(t * s * sqrt(1 + d2 %*% solve(t(d) %*% d, t(d2))))
prediction_me <- cbind(fit, fit - half, fit + half)
prediction_R <- predict(mod, newdata = data2, interval = 'prediction')

range(prediction_me - prediction_R)
# [1] 0 0
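
As a variation on the same idea, here is a minimal sketch that computes only the diagonal terms directly, one quadratic form per new observation, instead of building the full 5 x 5 matrix and then discarding the off-diagonal entries. The names X, X0, XtX_inv, lev, tcrit and prediction_rowwise are illustrative and not part of the original code; tcrit is used instead of t only to avoid shadowing the transpose function t(). The design matrices are taken from the fitted model with model.matrix, which is an assumption of convenience rather than a requirement.

```r
set.seed(100)
# Same simulated data and model as in the question
x     <- rnorm(100, 10)
y     <- 3 + x + rnorm(100, 5)
data  <- data.frame(x = x, y = y)
mod   <- lm(y ~ x, data = data)
data2 <- data.frame(x = rnorm(5, 10))

# Design matrices for the training data and the new observations
X  <- model.matrix(mod)
X0 <- model.matrix(delete.response(terms(mod)), data2)

# (X'X)^{-1}
XtX_inv <- solve(crossprod(X))

# x0' (X'X)^{-1} x0 for each row x0 of X0; this equals the diagonal
# of X0 %*% XtX_inv %*% t(X0) without forming the full matrix
lev <- rowSums((X0 %*% XtX_inv) * X0)

fit   <- drop(X0 %*% coef(mod))
tcrit <- qt(1 - .025, df.residual(mod))  # t quantile for a 95% interval
s     <- summary(mod)$sigma              # residual standard error
half  <- tcrit * s * sqrt(1 + lev)       # half-width of each interval

prediction_rowwise <- cbind(fit = fit, lwr = fit - half, upr = fit + half)
prediction_R <- predict(mod, newdata = data2, interval = 'prediction')

# Compare the two results up to floating-point tolerance
all.equal(unname(prediction_rowwise), unname(prediction_R))
```

For five new observations the difference in cost is negligible, but with many new rows the row-wise version avoids allocating a matrix whose size grows with the square of the number of new observations.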
