
Calculating CIs of fixed effects using confint in R

I would like to perform bootstrapping to obtain 95% CIs of the fixed effects in a binomial GLMM:

m <- glmer(cbind(Valid.detections, Missed.detections) ~ distance +
               Habitat + Replicate + transmitter.depth + receiver.depth +
               wind.speed + wtc + Transmitter + (1 | Unit) +
               (1 | SUR.ID) + distance:Transmitter +
               distance:Habitat + distance:transmitter.depth + distance:receiver.depth +
               distance:wind.speed,
           data = df, family = binomial(link = logit),
           control = glmerControl(calc.derivs = FALSE))

I found that the confint() function can do this, so I called it as follows:

confint(m, method = "boot", boot.type = "basic", seed = 123, nsim = 1000)

The function had been running for more than 8 hours before I decided to terminate it. Upon termination, I got the following warning message, repeated 10 times:

Warning messages:
1: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf,  :
  failure to converge in 10000 evaluations

My questions are: 1) Do I need to worry about these warning messages, and if so, how can I deal with them? 2) Because it was still running after 8 hours, I have no idea how long this function takes, so it would be nice to have some sort of progress bar while it runs. I read that confint() can pass arguments through to bootMer(), so I included the argument .progress = "txt", resulting in:

confint(m, method = "boot", boot.type = "basic", seed = 123, nsim = 1000, .progress = "txt")

but it doesn't seem to work. Alternatively, if there are better ways to achieve the same goal, I'm open to suggestions.

Thanks for any help.

Here's an example:

library("lme4")
(t1 <- system.time(
    gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial)))
##    user  system elapsed 
##   0.188   0.000   0.186

nranpars <- length(getME(gm1,"theta"))
nfixpars <- length(fixef(gm1))

(t2 <- system.time(c1 <- confint(gm1,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  .progress="txt")))

##    user  system elapsed 
## 221.958   0.164 222.187

Note that this progress bar is only 80 characters long, so it increments only after every 1000/80 ≈ 12 bootstrap iterations. Since each bootstrap replicate refits the model, if your original model took an hour to fit you shouldn't expect to see any progress-bar activity until about 12 hours later ...
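If you want finer control over the progress bar (or over the bootstrap itself), note that confint(..., method = "boot") is a wrapper around bootMer(), which you can call directly. A minimal sketch, reusing gm1 from above (nsim = 200 here is just for illustration):

library(boot)  ## for boot.ci()
b1 <- bootMer(gm1, FUN = fixef, nsim = 200,
              .progress = "txt", PBargs = list(style = 3))
## percentile CI for the first fixed-effect parameter
boot.ci(b1, index = 1, type = "perc")

For comparison, the default profile-likelihood CIs are much faster: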

(t3 <- system.time(c2 <- confint(gm1,
                  parm=(nranpars+1):(nranpars+nfixpars))))

##    user  system elapsed 
##   5.212   0.012   5.236 

1000 bootstrap reps is probably overkill -- if your model fit is slow, you can probably get reasonable CIs from 200 bootstrap reps.
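For example, a sketch of the same call with fewer replicates (everything else as above):

(t2b <- system.time(c1b <- confint(gm1, method = "boot", nsim = 200,
                    parm = (nranpars+1):(nranpars+nfixpars),
                    .progress = "txt")))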

I tried this with optimizer="nloptwrap" as well, hoping it would speed things up. It did, although there is a drawback (see below).

(t4 <- system.time(
    gm1B <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial, 
                 control=glmerControl(optimizer="nloptwrap"))))
##   user  system elapsed 
##  0.064   0.008   0.075 

(t5 <- system.time(c3 <- confint(gm1B,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  .progress="txt",PBargs=list(style=3))))
##
##   user  system elapsed 
## 65.264   2.160  67.504

This is much faster, but gives warnings (37 in this case) about convergence. According to all.equal(), the confidence intervals calculated this way differ by only about 2%. (There are still some wrinkles to sort out in the package itself ...)
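The comparison was along these lines (c1 and c3 are the bootstrap CI matrices computed above; for numeric matrices, all.equal() reports the mean relative difference):

all.equal(c1, c3)
## e.g. "Mean relative difference: 0.02..." (the exact value will vary)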

Your best bet for speeding this up will be to parallelize -- unfortunately, this way you lose the ability to use a progress bar ...

(t6 <- system.time(c4 <- confint(gm1,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  parallel="multicore", ncpus=4)))

## 
##     user  system elapsed 
##  310.355   0.916 116.917

This takes more user time (it counts the time used across all cores), but the elapsed time is cut in half. (It would be nice to do better with 4 cores, but twice as fast is still good. These are virtual cores on a virtual Linux machine; real (non-virtual) cores might perform better.)

With nloptwrap and multicore combined, I can get the time down to 91 seconds (user) / 36 seconds (elapsed).
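That combined run is just the nloptwrap-fitted model plus the parallel arguments, i.e. something like:

(t7 <- system.time(c5 <- confint(gm1B, method = "boot", nsim = 1000,
                    parm = (nranpars+1):(nranpars+nfixpars),
                    parallel = "multicore", ncpus = 4)))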
