
Confidence intervals for glmmTMB negative binomial model

I am trying to calculate 95% confidence intervals for model estimates in glmmTMB (family: nbinom1).

I am able to do this using a glmer.nb model and emmeans, using type = "response" to back-transform the estimates and confidence intervals.

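library(lme4); library(emmeans)
# dat: placeholder for the data frame holding response, p1, p2 and block
model <- glmer.nb(response ~ p1 + p2 + (1 | block), data = dat)
emmeans(model, ~ p1 + p2, type = "response")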

I think the equivalent function for glmmTMB is confint(model), but it does not back-transform the estimates.

Can anyone help me make this work with a glmmTMB model, the way it works for glmer?

This is an apples-versus-oranges situation.

The emmeans call computes predictions from the model at each combination of p1 and p2. Those can be back-transformed.

However, confint(model) asks for inferences on the regression coefficients. Those coefficients are in essence slopes. Unlike the EMMs, they are not predictions on the log scale, so they cannot be back-transformed into means on the response scale.
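The distinction can be illustrated with a minimal sketch in base R. It uses a Poisson GLM on the built-in warpbreaks data as a stand-in for the mixed model (an assumption made only so the example runs without extra packages; the names fit, b, eta and pred are illustrative). A predicted mean combines several coefficients before back-transforming, whereas exponentiating a single coefficient gives a ratio between two means rather than a mean.

```r
# Stand-in model: Poisson GLM with a log link on a built-in data set
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
b <- coef(fit)

# Coefficients live on the link (log) scale and describe differences
# between groups, not group means
b

# A predicted mean for wool B, tension L: sum the relevant coefficients on
# the log scale, then back-transform
eta <- b[["(Intercept)"]] + b[["woolB"]]
new <- data.frame(
  wool    = factor("B", levels = levels(warpbreaks$wool)),
  tension = factor("L", levels = levels(warpbreaks$tension))
)
pred <- predict(fit, newdata = new, type = "response")

# exp() of a single coefficient is a ratio of means (wool B relative to wool A)
ratio_B_vs_A <- exp(b[["woolB"]])
```

Here exp(eta) and pred describe the same quantity, a mean on the response scale, while ratio_B_vs_A is a multiplicative effect.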

emmeans() should also work for glmmTMB models. For me, it works, and the CIs are also back-transformed (from the log scale):

library(glmmTMB)
data(Salamanders)

m <- glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~ spp + mined,
  family = truncated_poisson,
  data = Salamanders
)

emmeans::emmeans(m, "spp", type = "response")
#>  spp    rate    SE  df lower.CL upper.CL
#>  GP    1.553 0.223 627    1.171     2.06
#>  PR    0.922 0.251 627    0.540     1.57
#>  DM    1.944 0.250 627    1.511     2.50
#>  EC-A  1.277 0.246 627    0.875     1.86
#>  EC-L  2.965 0.337 627    2.372     3.71
#>  DES-L 2.844 0.321 627    2.280     3.55
#>  DF    1.626 0.222 627    1.244     2.13
#> 
#> Results are averaged over the levels of: mined 
#> Confidence level used: 0.95 
#> Intervals are back-transformed from the log scale

Created on 2019-08-09 by the reprex package (v0.3.0)
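Since the question uses family nbinom1, the same approach can be sketched for that family. This is an untested variation on the example above, not output from the original answer: it assumes the Salamanders data, drops the zero-inflation part to keep things short, and uses the illustrative names m_nb1 and emm_nb1.

```r
library(glmmTMB)
library(emmeans)
data(Salamanders)

# Same fixed and random effects as the example above, but with the NB-1
# family from the question and no zero-inflation part (an assumption)
m_nb1 <- glmmTMB(
  count ~ spp + mined + (1 | site),
  family = nbinom1,
  data = Salamanders
)

# Inference on the regression coefficients (link scale)
confint(m_nb1, parm = "beta_")

# Predictions at each spp x mined combination, with intervals
# back-transformed from the log scale
emm_nb1 <- emmeans(m_nb1, ~ spp + mined, type = "response")
emm_nb1
```

The confint() call reports the coefficients, while the emmeans() call reports predicted counts, which is the quantity the glmer.nb workflow in the question produced.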

In case you need predicted values (and confidence intervals) for mixed models with zero-inflation, the computation is a bit more complicated, but it is implemented in the ggeffects package. You can find more details in this vignette.

By default, neither predict() nor emmeans() returns the most appropriate confidence intervals, that is, those proposed by Brooks et al. (see https://journal.r-project.org/archive/2017/RJ-2017-066/RJ-2017-066.pdf, appendix).

See examples here:

library(glmmTMB)
data(Salamanders)

m <- glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~ spp + mined,
  family = truncated_poisson,
  data = Salamanders
)

# emmeans, not taking zero-inflation component into account
emmeans::emmeans(m, "spp", type = "response")
#>  spp    rate    SE  df lower.CL upper.CL
#>  GP    1.553 0.223 627    1.171     2.06
#>  PR    0.922 0.251 627    0.540     1.57
#>  DM    1.944 0.250 627    1.511     2.50
#>  EC-A  1.277 0.246 627    0.875     1.86
#>  EC-L  2.965 0.337 627    2.372     3.71
#>  DES-L 2.844 0.321 627    2.280     3.55
#>  DF    1.626 0.222 627    1.244     2.13
#> 
#> Results are averaged over the levels of: mined 
#> Confidence level used: 0.95 
#> Intervals are back-transformed from the log scale

# ggeffects with emmeans, not taking zero-inflation 
# component into account
ggeffects::ggemmeans(m, "spp", type = "fe")
#> 
#> # Predicted counts of count
#> # x = spp
#> 
#>      x predicted conf.low conf.high
#>     GP     1.553    1.171     2.059
#>     PR     0.922    0.540     1.574
#>     DM     1.944    1.511     2.502
#>   EC-A     1.277    0.875     1.863
#>   EC-L     2.965    2.372     3.707
#>  DES-L     2.844    2.280     3.549
#>     DF     1.626    1.244     2.126

# ggeffects with emmeans, taking zero-inflation 
# component into account
ggeffects::ggemmeans(m, "spp", type = "fe.zi")
#> 
#> # Predicted counts of count
#> # x = spp
#> 
#>      x predicted conf.low conf.high
#>     GP     0.567    0.483     0.651
#>     PR     0.089    0.072     0.107
#>     DM     0.911    0.773     1.048
#>   EC-A     0.204    0.167     0.242
#>   EC-L     1.389    1.172     1.606
#>  DES-L     1.506    1.268     1.744
#>     DF     0.762    0.637     0.886

Created on 2019-08-09 by the reprex package (v0.3.0)

The observed mean of a zero-inflated model is a product of two components: a mean structure and a probability of excess zeros, each estimated on the link scale, typically giving y = exp(log_mu)*(1 - plogis(log_zi)). Although log_mu and log_zi are each assumed to be a linear combination of coefficients whose uncertainties follow normal distributions, the distribution of y, as a nonlinear function of these two components, is unknown. Therefore, confidence intervals for predicted means in zero-inflated models are usually built via simulation.
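The simulation idea can be sketched in base R for a single prediction point. All numbers below are hypothetical, chosen only for illustration, and the percentile interval used here is one simple choice; this is not a reproduction of what ggeffects or marginaleffects do internally. The variable log_zi follows the naming in the text and is on the logit scale.

```r
set.seed(123)  # makes the random draws reproducible

# Hypothetical link-scale estimates and standard errors (illustrative only)
log_mu    <- 0.8    # linear predictor of the mean structure (log scale)
se_log_mu <- 0.15
log_zi    <- -0.5   # linear predictor of the excess-zero part (logit scale)
se_log_zi <- 0.30

n_draws <- 1000

# Draw each component separately from its normal approximation,
# then combine the draws on the response scale
mu_draws <- exp(rnorm(n_draws, mean = log_mu, sd = se_log_mu))
zi_draws <- plogis(rnorm(n_draws, mean = log_zi, sd = se_log_zi))
y_draws  <- mu_draws * (1 - zi_draws)

# Point prediction and a 95% percentile interval from the simulated values
y_hat <- exp(log_mu) * (1 - plogis(log_zi))
ci    <- quantile(y_draws, probs = c(0.025, 0.975))
```

For a fitted model, the two scalars would be replaced by draws of the coefficient vectors from multivariate normal distributions, multiplied by the corresponding model matrices.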

I noticed that, for zero-inflated count models, ggeffects::ggpredict(type = "fe.zi") and marginaleffects::predictions() are based on a simulation: 1000 random draws each for the mean structure and the excess-zero probability, taken separately from multivariate normal distributions, following Brooks et al. (2017, pp. 391-392). Their results therefore vary from run to run. I have not found a way to control the number of draws in either function, but setting the seed with set.seed() gives reproducible results. Somehow, the confidence intervals are in most cases symmetrical around the mean in my case study. On the other hand, ggeffects::ggemmeans() does not vary from run to run, which I have not been able to explain. I suspect that ggemmeans() is based on the observed distribution within the sample, so that exp(log_mu)*(1 - plogis(log_zi)) is calculated for each observation. It also produces symmetrical confidence intervals. These three methods give standard errors on the response scale. ggeffects::ggeffect() only generates predictions and confidence intervals for the mean structure, ignoring the zero-inflation probabilities regardless of the type = argument, and can thus be misleading.

For count models without zero inflation, ggeffects::ggpredict(type = "fixed"), ggeffects::ggemmeans(), and marginaleffects::predictions() build confidence intervals for y = exp(log_mu) based on a normal approximation of log_mu. The CI of y is therefore approximately exp(log_mu ± 1.96 * SE(log_mu)), which is asymmetrical around mu. This holds as long as the assumed response distribution is in the exponential family, which includes Poisson, the conventional negative binomial (NB-2), and other negative binomial variants (e.g. NB-1, NB-heterogeneous). Note that the {ggeffects} package gives standard errors on the link scale (log for count models), whereas the {marginaleffects} package gives standard errors on the response scale. The relationship between the two is simple: SE(response) = y*SE(link).
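Both relationships can be checked with a small base-R sketch. It assumes a Poisson GLM on the built-in warpbreaks data as a stand-in for a count model without zero inflation; the names fit, new, link, resp and ci are illustrative.

```r
# Stand-in count model with a log link
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# One row per wool x tension combination
new <- expand.grid(
  wool    = levels(warpbreaks$wool),
  tension = levels(warpbreaks$tension)
)

# Predictions and standard errors on the link and on the response scale
link <- predict(fit, newdata = new, type = "link", se.fit = TRUE)
resp <- predict(fit, newdata = new, type = "response", se.fit = TRUE)

# Normal-approximation interval on the log scale, then back-transformed
z  <- qnorm(0.975)
ci <- data.frame(
  new,
  y     = exp(link$fit),
  lower = exp(link$fit - z * link$se.fit),
  upper = exp(link$fit + z * link$se.fit)
)

# Delta-method relationship between the two standard errors
se_response <- ci$y * link$se.fit
```

The interval is symmetric on the log scale, so on the response scale the upper arm is longer than the lower arm, and se_response matches the standard error that predict() reports for type = "response".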

Brooks, M. E., Kristensen, K., van Benthem, K. J., Magnusson, A., Berg, C. W., Nielsen, A., Skaug, H. J., Mächler, M., & Bolker, B. M. (2017). glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. The R Journal, 9(2), 378. https://doi.org/10.32614/RJ-2017-066
