
Confidence intervals in a ggplot2 graph and values obtained using predict function in R are not the same

This is my data (EXAMPLE):

EXAMPLE<-data.frame(
    X=c(99.6, 98.02, 96.43, 94.44, 92.06, 90.08, 87.3, 84.92, 82.14, 
79.76, 76.98, 74.21, 71.03, 67.86, 65.08, 62.3, 59.92, 56.35, 
52.38, 45.63, 41.67, 35.71, 30.95, 24.6, 17.86, 98.44, 96.48, 
94.14, 92.19, 89.84, 87.5, 84.38, 82.42, 78.52, 76.17, 73.83, 
70.7, 65.63, 62.89, 60.16, 58.2, 54.69, 52.73, 49.61, 46.09, 
42.58, 40.23, 36.72, 32.81),
    Y=c(3.62, 9.78, 15.22, 19.93, 24.64, 30.43, 35.14, 39.49, 44.93, 
49.64, 52.9, 57.97, 62.68, 66.3, 70.29, 73.55, 76.09, 78.62, 
80.8, 82.61, 84.42, 87.32, 91.67, 96.01, 99.28, 3.85, 8.55, 11.97, 
17.52, 20.94, 25.21, 29.49, 34.62, 38.89, 41.88, 46.58, 50.43, 
57.26, 63.25, 67.09, 70.09, 74.79, 79.06, 82.91, 88.03, 91.88, 
95.3, 97.86, 99.57))

I do a polynomial regression:

> LinearModel.2 <- lm(Y ~ X +I(X ^2), data=EXAMPLE)

> summary(LinearModel.2)

Call:
lm(formula = Y ~ X + I(X^2), data = CET2M3)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.3278 -4.0767  0.2222  4.7403  6.3660 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 97.041626   5.491862  17.670  < 2e-16 ***
X            0.339600   0.183034   1.855     0.07 .  
I(X^2)      -0.012709   0.001416  -8.975 1.13e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.706 on 46 degrees of freedom
Multiple R-squared:  0.9755,    Adjusted R-squared:  0.9745
F-statistic:   917 on 2 and 46 DF,  p-value: < 2.2e-16

I also compute the 95% confidence intervals for the coefficients:

> Confint(LinearModel.2, level=0.95)
               Estimate       2.5 %        97.5 %
(Intercept) 97.04162631 85.98708171 108.096170906
X            0.33959960 -0.02882900   0.708028199
I(X^2)      -0.01270946 -0.01555982  -0.009859103

When I plot the regression with ggplot2, I get the following image:

qplot(X, Y, data=EXAMPLE, geom=c("point", "smooth"), method="lm", formula= y ~ poly(x, 2))

[image: plot produced by the qplot command above]

Finally, I predict the value of Y at X = 50 from the polynomial regression, together with its interval, using the following commands:

newdata50 = data.frame(X=50) 
predict(LinearModel.2,newdata50,interval="predict")

I get the following values:

Fit=82.24796    2.5%=72.57762    97.5%=91.91829

Although the Fit value matches what the ggplot2 graph shows, the interval does not match the band drawn in the graph.

What's wrong? Which one should I trust? Why aren't they the same?

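There's a difference between a prediction interval and a confidence interval. A confidence interval covers the mean response at a given X, while a prediction interval covers a single new observation at that X, so it is wider. Observe: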

predict(LinearModel.2,newdata50,interval="predict")
#        fit      lwr      upr
# 1 82.24791 72.58054 91.91528
predict(LinearModel.2,newdata50,interval="confidence")
#        fit      lwr      upr
# 1 82.24791 80.30089 84.19494

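ggplot draws the confidence interval, not the prediction interval, so the band in the plot corresponds to interval="confidence" rather than to the interval="predict" call in the question.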
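
To see where the two intervals come from, here is a minimal sketch that rebuilds both by hand at X = 50 and then overlays both bands on the data. It assumes the EXAMPLE data from the question; the names fit, new50, grid and bands are introduced here for illustration only, and the plotting part is skipped if ggplot2 is not installed.

```r
# Data from the question.
EXAMPLE <- data.frame(
    X = c(99.6, 98.02, 96.43, 94.44, 92.06, 90.08, 87.3, 84.92, 82.14,
          79.76, 76.98, 74.21, 71.03, 67.86, 65.08, 62.3, 59.92, 56.35,
          52.38, 45.63, 41.67, 35.71, 30.95, 24.6, 17.86, 98.44, 96.48,
          94.14, 92.19, 89.84, 87.5, 84.38, 82.42, 78.52, 76.17, 73.83,
          70.7, 65.63, 62.89, 60.16, 58.2, 54.69, 52.73, 49.61, 46.09,
          42.58, 40.23, 36.72, 32.81),
    Y = c(3.62, 9.78, 15.22, 19.93, 24.64, 30.43, 35.14, 39.49, 44.93,
          49.64, 52.9, 57.97, 62.68, 66.3, 70.29, 73.55, 76.09, 78.62,
          80.8, 82.61, 84.42, 87.32, 91.67, 96.01, 99.28, 3.85, 8.55, 11.97,
          17.52, 20.94, 25.21, 29.49, 34.62, 38.89, 41.88, 46.58, 50.43,
          57.26, 63.25, 67.09, 70.09, 74.79, 79.06, 82.91, 88.03, 91.88,
          95.3, 97.86, 99.57))

fit   <- lm(Y ~ X + I(X^2), data = EXAMPLE)
new50 <- data.frame(X = 50)

# se.fit is the standard error of the fitted mean at X = 50;
# residual.scale is the residual standard error of the model.
p     <- predict(fit, new50, se.fit = TRUE)
tcrit <- qt(0.975, df = p$df)

# Confidence interval: uncertainty in the fitted mean only.
conf_by_hand <- unname(p$fit) + c(-1, 1) * tcrit * p$se.fit

# Prediction interval: the residual variance of a new observation is added.
pred_by_hand <- unname(p$fit) + c(-1, 1) * tcrit *
    sqrt(p$se.fit^2 + p$residual.scale^2)

# Both bands over the observed range of X, for plotting.
grid  <- data.frame(X = seq(min(EXAMPLE$X), max(EXAMPLE$X), length.out = 100))
ci    <- predict(fit, grid, interval = "confidence")
pr    <- predict(fit, grid, interval = "prediction")
bands <- data.frame(X = grid$X, fit = ci[, "fit"],
                    ci_lwr = ci[, "lwr"], ci_upr = ci[, "upr"],
                    pi_lwr = pr[, "lwr"], pi_upr = pr[, "upr"])

if (requireNamespace("ggplot2", quietly = TRUE)) {
    library(ggplot2)
    print(
        ggplot(EXAMPLE, aes(x = X, y = Y)) +
            geom_point() +
            # Wide, light band: prediction interval.
            geom_ribbon(data = bands, inherit.aes = FALSE, alpha = 0.15,
                        aes(x = X, ymin = pi_lwr, ymax = pi_upr)) +
            # Narrow, darker band: confidence interval.
            geom_ribbon(data = bands, inherit.aes = FALSE, alpha = 0.35,
                        aes(x = X, ymin = ci_lwr, ymax = ci_upr)) +
            geom_line(data = bands, aes(x = X, y = fit))
    )
}
```

The only difference between the two hand-built intervals is the extra residual variance under the square root, which is why the prediction interval is always the wider of the two. If you want the plot to show the prediction interval, you have to compute it with predict and draw it yourself, as in the ribbon layers above.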


 