I am trying to reconcile the confidence intervals shown on the ggplot (bootstrapped CIs) with the ones I compute from the lmer model. I am unsure how to calculate the CIs from the model. How would I then plot the original points together with the model-based means and CIs?
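library(lmerTest)  # attaches lme4 and adds the df / p-value columns shown in the summary below
library(ggplot2)
set.seed(111)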
oviposition.index <- rnorm(20, 2, 1.3)
species <- rep(c("A","B"), each = 10)
month <- rep(c("Jan", "Feb"), times = 10)
plot <- rep(c("1", "2"), times = 10)
df <- data.frame(oviposition.index, species, month, plot)
mod <- lmer(oviposition.index ~ species + (1|month/plot), df)
summary(mod)
confint(mod)
Model summary and confidence intervals:

Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)
(Intercept)   1.1303     0.3684  3.5822   3.068   0.0432 *
speciesB      0.8198     0.5131 17.0000   1.598   0.1285

                 2.5 %   97.5 %
.sig01       0.0000000 1.130819
.sig02       0.0000000 1.130819
.sigma       0.8232305 1.540289
(Intercept)  0.3600376 1.900525
speciesB    -0.1836920 1.823253
The way I see it: Species A:
Lower CI = 1.1303 - 0.3600376 = 0.7702624 (DOES NOT match graph)
Upper CI = 1.1303 + 1.900525 = 3.030825 (DOES NOT match graph)
Species B:
Lower CI = 0.8198 - (-0.1836920 )= 1.003492 (roughly matches graph)
Upper CI = 0.8198 + (1.823253) = 2.643053 (roughly matches graph)
Plot shows
ggplot(df, aes(x = species, y = oviposition.index, color = species)) + geom_point() +
geom_hline(yintercept = 1) +
stat_summary(fun.data=mean_cl_boot, geom="errorbar", width=0.2, colour="black") +
stat_summary(fun = mean, color = "black", geom ="point", size = 3,show.legend = FALSE)
The confidence intervals won't be the same because your mixed effects model has grouping variables that ggplot's (really Hmisc's) boot CI function doesn't know about. Ultimately this leads to the mixed effects model estimating more error in this scenario, which we see in the CIs.
That said, the CIs from lmer are close to what you have plotted already. Group A (the (Intercept) term) has an estimated mean of 1.1303 with a standard error of 0.3684, and group B has an estimated mean of ~1.95 (1.13 + 0.82), where the speciesB coefficient (the difference from group A) has a larger standard error of 0.5131. I don't think your interpretation of the group differences will change with either one of the CI calculations.
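To make the reading of the confint() output concrete, here is a small sketch in base R that uses only the numbers printed in the question (the variable names are made up for illustration). The limits returned by confint() are already the endpoints of the interval, so they should not be added to or subtracted from the estimate. A Wald interval built from the estimate and its standard error lands close to the profile limits.

```r
# Numbers copied from the summary(mod) and confint(mod) output in the question.
est_A  <- 1.1303   # (Intercept): estimated mean of species A
se_A   <- 0.3684   # standard error of that estimate
diff_B <- 0.8198   # speciesB: estimated difference (B minus A)

# Wald (normal-approximation) 95% interval for the species A mean.
wald_A <- est_A + c(-1, 1) * qnorm(0.975) * se_A
round(wald_A, 2)
# 0.41 1.85

# Profile-likelihood limits from confint(mod): these ARE the interval endpoints.
profile_A <- c(lower = 0.3600376, upper = 1.900525)

# Point estimate of the species B mean (intercept plus difference).
mean_B <- est_A + diff_B
```

The speciesB row of confint() is an interval for the difference between the species, not for the species B mean. Getting an interval for the species B mean needs the covariance between the two coefficients, so the simplest route is probably to refit with `species - 1` so that each coefficient is a group mean.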
A few more points to add to @Nate's answer.
The idea that the bootstrap function from Hmisc (which is what mean_cl_boot uses) is wrong because it doesn't take the grouping structure into account is basically correct.
I modified your fitting function slightly to make it more convenient to look at the confidence intervals for species A (suppressing the intercept by including -1 in the formula). I also tried it with and without lmerTest, for the purpose of making some comparisons discussed in more detail below.
library(lme4)
mod0 <- lmer(oviposition.index ~ species-1 + (1|month/plot), df)
library(lmerTest)
mod1 <- as(mod0, "lmerModLmerTest")
library(broom.mixed)
## tidy fixed-effect summary for CI method `m`; keep only the first row (species A)
f <- function(m, mod = mod0, ...) {
  tt <- tidy(mod, conf.int = TRUE, effects = "fixed", conf.method = m, ...)
  as.data.frame(tt)[1, c("estimate", "conf.low", "conf.high")]
}
ctab <- rbind(
  hmboot      = Hmisc::smean.cl.boot(oviposition.index[1:10]),
  hmwald      = Hmisc::smean.cl.normal(oviposition.index[1:10]),
  wald_lmer   = f("Wald"),
  wald_t_satt = f("Wald", mod1),
  wald_t_kr   = f("Wald", mod1, ddf.method = "Kenward-Roger"),
  profile     = f("profile"),
  pboot       = f("boot")
)
print(ctab, digits = 3)
The conclusion here is that most of the methods give broadly similar CIs. The naive bootstrap (as you've used above) gives the (slightly) narrowest CIs, and the Wald estimate with Kenward-Roger degrees of freedom gives the widest (probably overconservative, as the parametric bootstrap (pboot) probably gives the best answer). (The Satterthwaite ddf approximation completely breaks down in this example and returns NaN.)
            estimate conf.low conf.high
hmboot          1.13   0.4397      1.68  ## naive bootstrap
hmwald          1.13   0.4005      1.86  ## naive Wald (t-distrib)
wald_lmer       1.13   0.4082      1.85  ## mixed-model Wald (Z-distrib)
wald_t_satt     1.13      NaN       NaN  ## mixed-model Wald (Satterthwaite)
wald_t_kr       1.13   0.0586      2.20  ## mixed-model Wald (Kenward-Roger)
profile         1.13   0.3600      1.90  ## likelihood profile CI
pboot           1.13   0.4111      1.82  ## parametric bootstrap CI
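For readers wondering what the "naive Wald (t-distrib)" row amounts to, the following base-R sketch computes a plain t-based interval for the species A observations by hand, ignoring the month/plot grouping entirely. It assumes the data are generated exactly as in the question. As far as I know this is the same formula that smean.cl.normal applies, but treat that as my reading rather than a guarantee. The z-based version is included only to show why swapping the t quantile for a normal quantile narrows the interval.

```r
# Recreate the response from the question; the first 10 values are species A.
set.seed(111)
oviposition.index <- rnorm(20, 2, 1.3)
x <- oviposition.index[1:10]

n  <- length(x)
se <- sd(x) / sqrt(n)   # standard error of the mean, no grouping structure

# 95% interval using the t distribution with n - 1 degrees of freedom.
t_ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se

# Same interval using the normal quantile instead: always a bit narrower.
z_ci <- mean(x) + c(-1, 1) * qnorm(0.975) * se

# Cross-check against the one-sample t-test interval from base R.
all.equal(t_ci, as.numeric(t.test(x)$conf.int))
```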
If we get a little fancier (code below) we can get CIs for both groups:
library(Hmisc)
## as before, but return the rows selected by `w` (both species by default)
f <- function(m, mod = mod0, w = 1:2, ...) {
  tt <- tidy(mod, conf.int = TRUE, effects = "fixed", conf.method = m, ...)
  tt[w, c("term", "estimate", "conf.low", "conf.high")]
}
## apply an Hmisc summary function (mean, lower, upper) to each species separately
h <- function(sfun) {
  tab <- do.call(rbind, lapply(split(df, df$species),
                               function(d) sfun(d$oviposition.index)))
  data.frame(term = paste0("species", c("A", "B")),
             setNames(as.data.frame(tab), c("estimate", "conf.low", "conf.high")))
}
h(smean.cl.normal)
tab2 <- dplyr::bind_rows(list(
hmisc_boot = h(smean.cl.boot),
hmisc_normal = h(smean.cl.normal),
wald_lmer = f("Wald"),
wald_t_satt = f("Wald", mod1),
wald_t_kr = f("Wald", mod1, ddf.method = "Kenward-Roger"),
profile = f("profile"),
boot = f("boot")),
.id = "method")
tab2$method <- factor(tab2$method, levels = unique(tab2$method))
ggplot(tab2, aes(x=term, y = estimate, colour = method)) +
geom_pointrange(aes(ymin=conf.low, ymax = conf.high), position = position_dodge(width=0.25)) +
geom_hline(yintercept = 1, lty = 2)
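To address the last part of the question (the original points together with model-based means and CIs), one possible sketch is below. It assumes the data and the no-intercept model from above, and it uses Wald intervals from confint() purely because they are fast. The data frame name `est` is made up for illustration, and `method = "profile"` or `method = "boot"` could be substituted.

```r
library(lme4)
library(ggplot2)

# Data as in the question.
set.seed(111)
df <- data.frame(
  oviposition.index = rnorm(20, 2, 1.3),
  species = rep(c("A", "B"), each = 10),
  month   = rep(c("Jan", "Feb"), times = 10),
  plot    = rep(c("1", "2"), times = 10)
)

# No-intercept fit: each fixed-effect coefficient is a species mean.
mod0 <- lmer(oviposition.index ~ species - 1 + (1 | month/plot), df)

# Wald intervals for the fixed effects only ("beta_" selects them).
ci <- confint(mod0, parm = "beta_", method = "Wald")

# One row per species: model estimate plus interval endpoints.
est <- data.frame(
  species   = c("A", "B"),
  estimate  = unname(fixef(mod0)),
  conf.low  = unname(ci[, 1]),
  conf.high = unname(ci[, 2])
)

# Raw points in colour, model-based mean and CI overlaid in black.
p <- ggplot(df, aes(x = species, y = oviposition.index, colour = species)) +
  geom_point() +
  geom_hline(yintercept = 1) +
  geom_pointrange(data = est,
                  aes(y = estimate, ymin = conf.low, ymax = conf.high),
                  colour = "black")
p
```

Because the second layer gets its own data frame, the black point and range come from the mixed model rather than from a per-group summary of the raw data.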