How to determine if a linear mixed model is overdetermined in lme4/nlme?

In the Orthodont dataset in nlme, there are 27 subjects, and each subject is measured at 4 different ages. I wish to use this data to explore under what conditions the model will be overdetermined. Here are the models:

library(nlme)
library(lme4)

m1 <- lmer( distance ~ age + (age|Subject), data = Orthodont )
m2 <- lmer( distance ~ age + I(age^2) + (age|Subject), data = Orthodont )
m3 <- lmer( distance ~ age + I(age^2) + I(age^3) + (age|Subject), data = Orthodont )

m1nlme <- lme(distance ~ age, random = ~ age|Subject, data = Orthodont)
m2nlme <- lme(distance ~ age + I(age^2), random = ~ age|Subject, data = Orthodont)
m3nlme <- lme(distance ~ age + I(age^2) + I(age^3), random = ~ age|Subject, data = Orthodont)
m4nlme <- lme(distance ~ age + I(age^2) + I(age^3), random = ~ age + I(age^2) + I(age^3)|Subject, data = Orthodont)

Of all of the above models, only m3 throws a warning message: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.00762984 (tol = 0.002, component 1)

Questions:

  1. What does the warning message suggest, and is it sensible to ignore it?
  2. For m2, the model estimates a fixed intercept and fixed coefficients for age and I(age^2), together with the random-effect parameters sigma^2_intercept, sigma^2_age, and sigma^2_intercept:age. So a total of 1+2+3=6 parameters are estimated for each Subject. But there are only 4 observations per subject. Why doesn't m2 throw an error? Isn't m2 overdetermined? Am I counting the number of parameters incorrectly somewhere?
  1. The warning message means that the model fit may be a bit numerically unstable. The check works by numerically evaluating the scaled gradient, but this depends in turn on the gradient and Hessian estimated by finite differences, which are themselves subject to numerical error. As I've stated in many different venues, these warnings definitely tend to be over-sensitive and are likely to be false positives: see e.g. ?lme4::convergence, ?lme4::troubleshooting. The gold standard is to use allFit() to refit the model with a variety of optimizers and make sure that the results from different optimizers are close enough to each other for your purposes.

  2. There are two random-effects values (BLUPs or conditional modes) per subject - the subject-level deviations of the intercept and of the slope with respect to age. For values, we will be in trouble if the number of values is greater than or equal to the number of observations per group (or, for GLMMs without a scale parameter, such as the Poisson, if the number of values is strictly greater than the number of observations per group). Here there are 2 values against 4 observations per subject, so there is no problem. For parameters, there are up to four fixed-effect parameters (intercept, linear, quadratic, and cubic terms in age) and three RE parameters (variance of the intercept, variance of the slope, covariance between intercept and slope), but these 7 parameters are estimated at the population level - the appropriate comparisons are with either the total number of observations (27 x 4 = 108) or the number of groups (27), not with the number of observations per group.
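
The counting in point 2 can be checked directly on a fitted model. The following is a minimal sketch for m2, assuming lme4 and nlme are installed; the variable names (n_obs, n_groups, and so on) are only illustrative. It counts parameters the same way as the text above, so the residual standard deviation, which is one further population-level parameter, is left out of the tally.

```r
library(lme4)
data(Orthodont, package = "nlme")

m2 <- lmer(distance ~ age + I(age^2) + (age | Subject), data = Orthodont)

# Population-level quantities: compare these with total observations / groups
n_obs    <- nobs(m2)                      # total number of observations
n_groups <- ngrps(m2)[["Subject"]]        # number of subjects
n_fixed  <- length(fixef(m2))             # intercept, age, age^2
n_theta  <- length(getME(m2, "theta"))    # RE covariance parameters (2 variances + 1 covariance)

# Subject-level quantities: compare these with observations per subject
n_modes_per_subject <- ncol(ranef(m2)$Subject)      # conditional modes per subject
obs_per_subject     <- as.vector(table(Orthodont$Subject))

cat("fixed + RE parameters:", n_fixed + n_theta,
    "vs", n_obs, "observations in", n_groups, "groups\n")
cat("conditional modes per subject:", n_modes_per_subject,
    "vs", unique(obs_per_subject), "observations per subject\n")
```

The 6 parameters from the question are therefore set against 108 observations in 27 groups, while only the 2 conditional modes are set against the 4 observations per subject.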

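For point 1, the allFit() check might look like the sketch below. It assumes a reasonably recent lme4; some optimizers are only tried if the optional packages optimx and dfoptim are installed. What counts as "close enough" is a judgment call, so the code only prints the quantities to compare and does not apply a threshold.

```r
library(lme4)
data(Orthodont, package = "nlme")

# The model that triggered the convergence warning
m3 <- lmer(distance ~ age + I(age^2) + I(age^3) + (age | Subject),
           data = Orthodont)

# Refit with every optimizer that is available in this installation
af <- allFit(m3)
ss <- summary(af)

ss$which.OK   # which optimizers ran without error
ss$llik       # log-likelihoods: these should agree closely across optimizers
ss$fixef      # fixed-effect estimates, one row per optimizer
ss$sdcor      # random-effect SDs and correlations, one row per optimizer

# Spread of the log-likelihood across optimizers, as one rough summary
diff(range(ss$llik))
```

If the log-likelihoods and estimates agree to as many digits as you care about, the original warning is probably a false positive.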