简体   繁体   中英

Does geom_smooth() of ggplot2 show pointwise confidence bands, or simultaneous confidence bands?

I was unsure whether this question would be more appropriate here or on Cross Validated. I hope I made the right choice.

Consider the example:

library(dplyr)
setosa <- iris %>% filter(Species == "setosa") %>% select(Sepal.Length, Sepal.Width, Species)
library(ggplot2)
ggplot(data = setosa, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_smooth(method ="lm", formula = y ~ poly(x,2))

在此处输入图片说明 By default, ggplot "displays confidence interval around smooth" (see here ), given by the gray area around the regression curve. I've always assumed these are simultaneous confidence bands for the regression curve , not pointwise confidence bands. ggplot2 documentation refers to the predict function for details on how the standard errors are computed. However, reading the doc for predict.lm , it doesn't say explicitly that simultaneous confidence bands are computed. So, what is the correct interpretation here?

One way to check what predict.lm() computes is to inspect the code ( predict multiplies standard errors by qt((1 - level)/2, df) , and so does not appear to make adjustments for simultaneous inference). Another way is to construct simultaneous confidence intervals and compare them against predict 's intervals.

Fit the model and construct simultaneous confidence intervals:

setosa <- subset(iris, Species == "setosa")
setosa <- setosa[order(setosa$Sepal.Length), ]
fit <- lm(Sepal.Width ~ poly(Sepal.Length, 2), setosa)

K <- cbind(1, poly(setosa$Sepal.Length, 2))
cht <- multcomp::glht(fit, linfct = K)
cci <- confint(cht)

Reshape and plot:

cc <- as.data.frame(cci$confint)
cc$Sepal.Length <- setosa$Sepal.Length
cc <- reshape2::melt(cc[, 2:4], id.var = "Sepal.Length")

library(ggplot2)
ggplot(data = setosa, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  geom_smooth(method ="lm", formula = y ~ poly(x,2)) +
  geom_line(data = cc, 
            aes(x = Sepal.Length, y = value, group = variable),
            colour = "red")

It appears that predict(.., interval = "confidence") does not produce simultaneous confidence intervals:

在此处输入图片说明

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM