简体   繁体   中英

How to compare a model with no random effects to a model with a random effect using lme4?

I can use gls() from the nlme package to build mod1 with no random effects. I can then compare mod1 using AIC to mod2 built using lme() which does include a random effect.

mod1 = gls(response ~ fixed1 + fixed2, method="REML", data)
mod2 = lme(response ~ fixed1 + fixed2, random = ~1 | random1, method="REML",data)
AIC(mod1,mod2)

Is there something similar to gls() for the lme4 package which would allow me to build mod3 with no random effects and compare it to mod4 built using lmer() which does include a random effect?

mod3 = ???(response ~ fixed1 + fixed2, REML=T, data)
mod4 = lmer(response ~ fixed1 + fixed2 + (1|random1), REML=T, data)
AIC(mod3,mod4)

With modern (>1.0) versions of lme4 you can make a direct comparison between lmer fits and the corresponding lm model, but you have to use ML --- it's hard to come up with a sensible analogue of the "REML criterion" for a model without random effects (because it would involve a linear transformation of the data that set all of the fixed effects to zero ...)

You should be aware that there are theoretical issues with information-theoretic comparisons between models with and without variance components: see the GLMM FAQ for more information.

library(lme4)
fm1 <- lmer(Reaction~Days+(1|Subject),sleepstudy, REML=FALSE)
fm0 <- lm(Reaction~Days,sleepstudy)
AIC(fm1,fm0)
##     df      AIC
## fm1  4 1802.079
## fm0  3 1906.293

I prefer output in this format (delta-AIC rather than raw AIC values):

bbmle::AICtab(fm1,fm0)
##     dAIC  df
## fm1   0.0 4 
## fm0 104.2 3 

To test, let's simulate data with no random effect (I had to try a couple of random-number seeds to get an example where the among-subject std dev was actually estimated as zero):

rr <- simulate(~Days+(1|Subject),
               newparams=list(theta=0,beta=fixef(fm1),
                         sigma=sigma(fm1)),
               newdata=sleepstudy,
               family="gaussian",
               seed=103)[[1]]
ss <- transform(sleepstudy,Reaction=rr)
fm1Z <- update(fm1,data=ss)
VarCorr(fm1Z)
##  Groups   Name        Std.Dev.
##  Subject  (Intercept)  0.000  
##  Residual             29.241
fm0Z <- update(fm0,data=ss)
all.equal(c(logLik(fm0Z)),c(logLik(fm1Z)))  ## TRUE

While I agree that with Ben that the simplest solution is to set REML=FALSE, the maximum REML likelihood for a model without random effects is well defined and is fairly straightforward to compute via the well known relation

在此处输入图片说明

between the ordinary profile likelihood function and the restricted likelihood.

The following code simulates data for which the estimated variance of the random intercept of a LMM ends up at 0 such that the maximum restricted log likelihood of the LMM should be equal to the restricted likelihood of the model without any random effects included.

The restricted likelihood of is computed via the above formula.

The two restricted log likelihoods both evaluate to -147.8086.

library(lme4)

# simulate some toy data for which the LMM ends up at the boundary
set.seed(5)
n <- 100 # the sample size
x <- rnorm(n) 
y <- rnorm(n)
group <- factor(rep(1:10,10))

# fit the LMM via REML
mod1 <- lmer(y ~ x + (1|group), REML=TRUE, control=lmerControl(boundary.tol=1e-8))
logLik(mod1)

# fit a model without random effects and compute its maximum REML log likelihood
mod0 <- lm(y ~ x)
p <- length(coef(mod0)) # number of fixed effect parameters
X <- model.matrix(mod0) # the fixed effect design matrix
sigma.REML <- summary(mod0)$sigma # REMLE of sigma
# the maximum ordinary log likelihood evaluated at the REML estimates
logLik.lm.at.REML <- sum(dnorm(residuals(mod0), 0, sigma.REML, log=TRUE))
# the restricted log likelihood of the model without random effects (via above formula)
logLik.lm.at.REML + p/2*log(2*pi) - 1/2*(- p*log(sigma.REML^2) + determinant(crossprod(X))$modulus)

Alternatively, just use glmmTMB

> library(glmmTMB)
> data <- data.frame(y,x,group)
> logLik(glmmTMB(y~x, family = gaussian(), data=data, REML=TRUE))
'log Lik.' -147.8086 (df=3)
> logLik(glmmTMB(y~x+(1|group), family = gaussian(), data=data, REML=TRUE))
'log Lik.' -147.8086 (df=4)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM