简体   繁体   English

跨栏或零膨胀混合模型的置信区间

[英]Confidence intervals for hurdle or zero inflated mixed models

I want to calculate CI in mixed models, zero inflated negative binomial and hurdle model.我想在混合模型、零膨胀负二项式和障碍 model 中计算 CI。 My code for hurdle model looks like this (x1, x2 continuous, x3 categorical):我的障碍代码 model 看起来像这样(x1,x2 连续,x3 分类):

m1 <- glmmTMB(count~x1+x2+x3+(1|year/class),
          data = bd, zi = ~x2+x3+(1|year/class), family = truncated_nbinom2,
          )

I used confint , and I got these results:我使用了confint ,得到了以下结果:

ci <- confint(m1,parm="beta_")
ci
                                          2.5 %       97.5 %     Estimate
cond.(Intercept)                   1.816255e-01  0.448860094  0.285524861
cond.x1                            9.045278e-01  0.972083366  0.937697401
cond.x2                            1.505770e+01 26.817439186 20.094998772
cond.x3high                        1.190972e+00  1.492335046  1.333164894
cond.x3low                         1.028147e+00  1.215828654  1.118056377
cond.x3reg                         1.135515e+00  1.385833853  1.254445909
class:year.cond.Std.Dev.(Intercept)2.256324e+00  2.662976154  2.441845815
year.cond.Std.Dev.(Intercept)      1.051889e+00  1.523719169  1.157153015
zi.(Intercept)                     1.234418e-04  0.001309705  0.000402085
zi.x2                              2.868578e-02  0.166378014  0.069084606
zi.x3high                          8.972025e-01  1.805832900  1.272869874

Am I calculating the intervals correctly?我是否正确计算了间隔? Why is there only one category in x3 for zi?为什么 zi 在 x3 中只有一个类别? If possible, I would also like to know if it's possible to plot these CIs.如果可能的话,我还想知道这些 CI 是否可以使用 plot。

Thanks!谢谢!

Data looks like this:数据如下所示:

     class id year count  x1      x2        x3 
956    5 3002 2002      3 15.6    47.9      high
957    5 4004 2002      3 14.3    47.9      low
958    5 6021 2002      3 14.2    47.9      high
959    4 2030 2002      3 10.5    46.3      high
960    4 2031 2002      3 15.3    46.3      high
961    4 2034 2002      3 15.2    46.3      reg

with x1 and x2 continuous, x3 three level categorical variable (factor) x1 和 x2 连续,x3 三级分类变量(因子)

Summary of the model: model总结:

summary(m1)
'giveCsparse' has been deprecated; setting 'repr = "T"' for you'giveCsparse' has been deprecated; setting 'repr = "T"' for you'giveCsparse' has been deprecated; setting 'repr = "T"' for you
 Family: truncated_nbinom2  ( log )
Formula:          count ~ x1 + x2 + x3 + (1 | year/class)
Zero inflation:          ~x2 + x3 + (1 | year/class)
Data: bd

     AIC      BIC   logLik deviance df.resid 
 37359.7  37479.7 -18663.8  37327.7    13323 

Random effects:

Conditional model:
 Groups    Name        Variance Std.Dev.
 class:year(Intercept) 0.79701  0.8928  
 year      (Intercept) 0.02131  0.1460  
Number of obs: 13339, groups:  class:year, 345; year, 15

Zero-inflation model:
 Groups    Name        Variance  Std.Dev. 
 dpto:year (Intercept) 1.024e+02 1.012e+01
 year      (Intercept) 7.842e-07 8.856e-04
Number of obs: 13339, groups:  class:year, 345; year, 15

Overdispersion parameter for truncated_nbinom2 family (): 1.02 

Conditional model:
               Estimate Std. Error z value Pr(>|z|)    
(Intercept)    -1.25343    0.23081  -5.431 5.62e-08 ***
x1             -0.06433    0.01837  -3.501 0.000464 ***
x2              3.00047    0.14724  20.378  < 2e-16 ***
x3high          0.28756    0.05755   4.997 5.82e-07 ***
x3low           0.11159    0.04277   2.609 0.009083 ** 
x3reg           0.22669    0.05082   4.461 8.17e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Zero-inflation model:
               Estimate Std. Error z value Pr(>|z|)    
(Intercept)     -7.8188     0.6025 -12.977  < 2e-16 ***
x2              -2.6724     0.4484  -5.959 2.53e-09 ***
x3high           0.2413     0.1784   1.352  0.17635    
x3low           -0.1325     0.1134  -1.169  0.24258    
x3reg           -0.3806     0.1436  -2.651  0.00802 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

CI with broom.mixed CI with broom.mixed

> broom.mixed::tidy(m1, effects="fixed", conf.int=TRUE)
# A tibble: 12 x 9
   effect component term           estimate std.error statistic  p.value conf.low conf.high
   <chr>  <chr>     <chr>             <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
 1 fixed  cond      (Intercept)     -1.25      0.231      -5.43 5.62e- 8  -1.71     -0.801 
 2 fixed  cond      x1              -0.0643    0.0184     -3.50 4.64e- 4  -0.100    -0.0283
 3 fixed  cond      x2               3.00      0.147      20.4  2.60e-92   2.71      3.29  
 4 fixed  cond      x3high           0.288     0.0575      5.00 5.82e- 7   0.175     0.400 
 5 fixed  cond      x3low            0.112     0.0428      2.61 9.08e- 3   0.0278    0.195 
 6 fixed  cond      x3reg            0.227     0.0508      4.46 8.17e- 6   0.127     0.326 
 7 fixed  zi        (Intercept)     -9.88      1.32       -7.49 7.04e-14 -12.5      -7.30  
 8 fixed  zi        x1               0.214     0.120       1.79 7.38e- 2  -0.0206    0.448 
 9 fixed  zi        x2              -2.69      0.449      -6.00 2.01e- 9  -3.57     -1.81  
10 fixed  zi        x3high           0.232     0.178       1.30 1.93e- 1  -0.117     0.582 
11 fixed  zi        x3low           -0.135     0.113      -1.19 2.36e- 1  -0.357     0.0878
12 fixed  zi        x4reg           -0.382     0.144      -2.66 7.74e- 3  -0.664    -0.101 

tl;dr as far as I can tell this is a bug in confint.glmmTMB (and probably in the internal function glmmTMB:::getParms ). tl;博士据我所知,这是confint.glmmTMB中的一个错误(可能在内部 function glmmTMB:::getParms )。 In the meantime, broom.mixed::tidy(m1, effects="fixed") should do what you want.与此同时, broom.mixed::tidy(m1, effects="fixed")应该做你想做的事。 (There's now a fix in progress in the development version on GitHub, should make it to CRAN sometime? soon...) (现在在 GitHub 的开发版本中正在进行修复,应该在某个时间把它送到 CRAN 吗?很快......)

Reproducible example:可重现的例子:

set up data设置数据

set.seed(101)
n <- 1e3
bd <- data.frame(
    year=factor(sample(2002:2018, size=n, replace=TRUE)),
    class=factor(sample(1:20, size=n, replace=TRUE)),
    x1 = rnorm(n),
    x2 = rnorm(n),
    x3 = factor(sample(c("low","reg","high"), size=n, replace=TRUE),
                levels=c("low","reg","high")),
    count = rnbinom(n, mu = 3, size=1))

fit合身

library(glmmTMB)
m1 <- glmmTMB(count~x1+x2+x3+(1|year/class),
          data = bd, zi = ~x2+x3+(1|year/class), family = truncated_nbinom2,
          )

confidence intervals置信区间

confint(m1, "beta_")  ## wrong/ incomplete
broom.mixed::tidy(m1, effects="fixed", conf.int=TRUE)  ## correct

You may want to think about which kind of confidence intervals you want:您可能需要考虑需要哪种置信区间:

  • Wald CIs (default) are much faster to compute and are generally OK as long as (1) your data set is large and (2) you aren't estimating any parameters on the log/logit scale that are near the boundaries Wald CI(默认)的计算速度要快得多,并且只要(1)您的数据集很大并且(2)您没有估计任何接近边界的 log/logit 尺度上的参数,通常都可以
  • likelihood profile CIs are more accurate but much slower可能性轮廓 CI 更准确但速度较慢

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM