Bootstrap approach for a zero-inflated mixed GLM

Question

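I want to bootstrap a zero-inflated mixed GLM (m_F) fitted with the glmmTMB package, but whether I use coef or fixef to extract the coefficients, I always get this error: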

Error in bres[i, ] <- coef(bfit) : 
  incorrect number of subscripts on matrix

My example:

library(glmmTMB)
library(boot)
my.ds <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/ds.desenvol.csv")
str(my.ds)
# 'data.frame': 400 obs. of  4 variables:
#  $ temp       : num  0 0 0 0 0 0 0 0 0 0 ...
#  $ storage    : int  5 5 5 5 5 5 5 5 5 5 ...
#  $ rep        : chr  "r1" "r2" "r3" "r4" ...
#  $ development: int  0 23 22 27 24 25 24 22 0 22 ...

# Fit a GLM mixed Hurdle (zero-inflated) log-link Gamma model
m_F <- glmmTMB(development ~ poly(temp,2) + (1 | storage), data = my.ds,
               family = ziGamma(link = "log"),
               ziformula = ~ 1)
summary(m_F)

# Create a bootstrap approach
nboot <- 1000
bres <- matrix(NA,nrow=nboot,
                  ncol=length(coef(m_F)),
                  dimnames=list(rep=seq(nboot),
                                coef=names(coef(m_F))))
set.seed(1000)
bootsize <- 100
for (i in seq(nboot)) {
  bdat <- my.ds[sample(nrow(my.ds),size=bootsize,replace=TRUE),]
  bfit <- update(m_F, data=bdat)  ## refit with new data
  bres[i,] <- coef(bfit)
}

Is there a way to do this?

Answer 1

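My answer is somewhat similar to @RuiBarradas's, but stays closer to your original code. The main point is that coef() doesn't do what you think it does. (1) The convention (originally set by the nlme package) is that for mixed models coef() returns a matrix (or list of matrices) of group-level coefficients, while fixef() returns the fixed-effect (population-level) coefficients; (2) for glmmTMB, fixef() returns a list of fixed-effect vectors for the conditional, zero-inflation, and dispersion models (unlist() collapses this back into a single vector with concatenated names).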
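
To make the coef()/fixef() distinction concrete, here is a minimal sketch. It uses the Salamanders data that ships with glmmTMB as a stand-in for the question's data, so the model formula and the object names (m, fe, fe_vec, cf) are assumptions for illustration only.

```r
library(glmmTMB)

# Stand-in model (not m_F): zero-inflated Poisson with a random intercept per site
m <- glmmTMB(count ~ mined + (1 | site),
             ziformula = ~ 1,
             family = poisson,
             data = Salamanders)

fe <- fixef(m)           # list: one fixed-effect vector per model component
names(fe)                # component names
fe_vec <- unlist(fe)     # single named numeric vector, names prefixed by component
cf <- coef(m)$cond$site  # data frame of group-level coefficients, one row per site
```

The flattened fe_vec is what fits into one row of a results matrix; cf has one row per group and therefore cannot be assigned to a matrix row directly.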

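Another point to keep in mind: for a data set with grouping structure, bootstrapping at the level of individual observations may be unwise. You can instead bootstrap at the group level, within groups, or both; you can bootstrap residuals if you have a linear model (this won't work for GLMMs with count data); and you can use lme4::bootMer to do a parametric bootstrap, which is pretty much the only option when you have a GLMM with crossed random effects.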
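
As a rough illustration of group-level resampling, the sketch below draws whole groups with replacement instead of single rows. cluster_resample is a hypothetical helper (not part of glmmTMB or boot), and toy is simulated data that only mimics the column names of the question's data set.

```r
# Hypothetical helper: resample whole groups (clusters) with replacement
cluster_resample <- function(data, group) {
  ids <- unique(data[[group]])
  drawn <- ids[sample.int(length(ids), replace = TRUE)]
  pieces <- lapply(seq_along(drawn), function(k) {
    piece <- data[data[[group]] == drawn[k], , drop = FALSE]
    piece[[group]] <- k  # relabel so a group drawn twice counts as two groups
    piece
  })
  do.call(rbind, pieces)
}

set.seed(1)
# Simulated stand-in data: 5 storage levels, 4 observations each
toy <- data.frame(storage = rep(c(5, 10, 15, 20, 30), each = 4),
                  development = rgamma(20, shape = 5, rate = 0.2))
boot_toy <- cluster_resample(toy, "storage")
table(boot_toy$storage)  # group sizes in the resampled data
```

With only a handful of groups (the output in Answer 2 shows five storage levels), the number of distinct group-level resamples is small, so this is best treated as a sketch of the idea rather than a recommendation for this data set.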

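PS: What is bootsize doing here? The standard bootstrap approach resamples, with replacement, a data set of the same size as the original. Resampling only a quarter of the data set (nrow(my.ds) == 400, bootsize == 100) is well-defined but very unusual. Are you deliberately doing some specific non-standard bootstrap?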
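
To see why the resample size matters, the following sketch bootstraps the mean of a simulated vector at full size and at a quarter of it. The data x and the helper boot_sd are assumptions made up for this illustration.

```r
set.seed(1000)
x <- rgamma(400, shape = 5, rate = 0.2)  # simulated stand-in, 400 observations

# Hypothetical helper: spread of the bootstrapped mean for a given resample size
boot_sd <- function(size, nboot = 500) {
  sd(replicate(nboot, mean(sample(x, size = size, replace = TRUE))))
}

sd_full  <- boot_sd(length(x))  # standard n-out-of-n bootstrap
sd_small <- boot_sd(100)        # 100-out-of-400, as in the question
sd_small / sd_full              # expected to be roughly sqrt(400 / 100) = 2
```

Resampling 100 of 400 rows therefore overstates the uncertainty of the full-data estimate unless the results are rescaled on purpose.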

# Flatten the fixed effects of all model components into one named vector
sum_fun <- function(fit) {
    unlist(fixef(fit))
}

nboot <- 1000  # number of bootstrap replicates, as in the question
bres <- matrix(NA,
               nrow=nboot,
               ncol=length(sum_fun(m_F)),
               dimnames=list(rep=seq(nboot),
                             coef=names(sum_fun(m_F))))
set.seed(1000)
bootsize <- 100
pb <- txtProgressBar(max = nboot, style = 3)  ## the loop runs nboot times
for (i in seq(nboot)) {
    setTxtProgressBar(pb, i)
    bdat <- my.ds[sample(nrow(my.ds), size=bootsize,replace=TRUE),]
    bfit <- update(m_F, data=bdat)  ## refit with new data
    bres[i,] <- sum_fun(bfit)
}
close(pb)
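
Once a matrix like bres is filled, percentile intervals can be read off column by column. The sketch below uses a simulated stand-in matrix, bres_demo, with made-up column names so that it runs without refitting any model.

```r
set.seed(1)
# Simulated stand-in for bres: 1000 replicates of two coefficients
bres_demo <- matrix(rnorm(2000, mean = rep(c(3, -0.5), each = 1000)),
                    ncol = 2,
                    dimnames = list(NULL, c("cond.(Intercept)", "zi.(Intercept)")))

# 95% percentile interval per coefficient; na.rm guards against failed refits
ci <- apply(bres_demo, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
t(ci)  # one row per coefficient, columns are the lower and upper bounds
```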

Answer 2

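To use package boot, you must define a function that bootstraps the data and then computes a statistic, or a vector of statistics, from it. That is the function ziboot below. Then call boot, passing it the data, the function, and the number of replicates.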
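
For orientation, here is a minimal sketch of that calling convention on simulated data: the statistic function receives the data and a vector of resampled row indices. The names x and stat_fun are assumptions for this illustration.

```r
library(boot)

set.seed(1)
x <- data.frame(y = rnorm(50, mean = 10))  # simulated stand-in data

# boot() calls this with the full data and a vector of resampled row indices
stat_fun <- function(data, i) {
  d <- data[i, , drop = FALSE]
  c(mean = mean(d$y), sd = sd(d$y))
}

b_demo <- boot(x, stat_fun, R = 200)
dim(b_demo$t)  # one row per replicate, one column per statistic
b_demo$t0      # statistics computed on the original data
```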

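The function fits essentially the same model as the question's code (with a linear temp term in place of poly(temp, 2)), but the model output must be converted to a vector of coefficients. That is what the lapply does.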
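
The flattening step can be tried in isolation on a toy data frame shaped like a table of group-level coefficients. The values in cf_demo are made up for this illustration.

```r
# Toy stand-in for a per-group coefficient table (rows = groups)
cf_demo <- data.frame("(Intercept)" = c(3.1, 3.2),
                      temp = c(-0.004, -0.004),
                      row.names = c("5", "10"),
                      check.names = FALSE)

l <- as.list(cf_demo)  # one list element per coefficient column
flat <- unlist(lapply(seq_along(l), function(j) {
  # name each value as <coefficient>_<group>
  setNames(l[[j]], paste(names(l)[j], row.names(cf_demo), sep = "_"))
}))
names(flat)
# "(Intercept)_5" "(Intercept)_10" "temp_5" "temp_10"
```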

library(glmmTMB)
library(boot)

my.ds <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/ds.desenvol.csv")

# Create a bootstrap approach
# This function will be called by boot() below
ziboot <- function(data, i) {
  # this bootstraps the data
  d <- data[i, ]
  model <- glmmTMB(development ~ temp + (1 | storage), data = d,
                 family = ziGamma(link = "log"),
                 ziformula = ~ 1)
  cf <- coef(model)$cond$storage
  l <- as.list(cf)
  unlist(lapply(seq_along(l), \(i){
    x <- l[[i]]
    nms <- paste(names(l)[i], row.names(cf), sep = "_")
    setNames(x, nms)
  }))
}

set.seed(1000)
bootsize <- 100
b <- boot(my.ds, ziboot, R = bootsize)
colnames(b$t) <- names(b$t0)
head(b$t)
#>      (Intercept)_5 (Intercept)_10 (Intercept)_15 (Intercept)_20 (Intercept)_30
#> [1,]      3.156717       3.153949       3.139001       3.147799       3.196308
#> [2,]      3.172563       3.157384       3.164663       3.143005       3.196966
#> [3,]      3.175124       3.154946       3.158715       3.129027       3.168753
#> [4,]      3.149817       3.143550       3.135256       3.141367       3.167679
#> [5,]      3.159183       3.179388       3.147193       3.148219       3.237395
#> [6,]      3.148815       3.168335       3.117576       3.126973       3.178377
#>            temp_5      temp_10      temp_15      temp_20      temp_30
#> [1,] -0.004089067 -0.004089067 -0.004089067 -0.004089067 -0.004089067
#> [2,] -0.004404738 -0.004404738 -0.004404738 -0.004404738 -0.004404738
#> [3,] -0.003153053 -0.003153053 -0.003153053 -0.003153053 -0.003153053
#> [4,] -0.003547863 -0.003547863 -0.003547863 -0.003547863 -0.003547863
#> [5,] -0.003989763 -0.003989763 -0.003989763 -0.003989763 -0.003989763
#> [6,] -0.003137722 -0.003137722 -0.003137722 -0.003137722 -0.003137722

Created on 2022-07-05 by the reprex package (v2.0.1)

Answer 1: 4, accepted, 2022-07-05 20:16:08

Answer 2: 3, 2022-07-05 20:10:32