Bootstrap approach for a zero-inflated mixed GLM

Question

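I want to bootstrap a zero-inflated mixed GLM (m_F) fitted with the glmmTMB package, but whether I use coef or fixef to extract the coefficients, I always get this error: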

Error in bres[i, ] <- coef(bfit) : 
  incorrect number of subscripts on matrix

My example:

library(glmmTMB)
library(boot)
my.ds <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/ds.desenvol.csv")
str(my.ds)
# 'data.frame': 400 obs. of  4 variables:
#  $ temp       : num  0 0 0 0 0 0 0 0 0 0 ...
#  $ storage    : int  5 5 5 5 5 5 5 5 5 5 ...
#  $ rep        : chr  "r1" "r2" "r3" "r4" ...
#  $ development: int  0 23 22 27 24 25 24 22 0 22 ...

# Fit a GLM mixed Hurdle (zero-inflated) log-link Gamma model
m_F <- glmmTMB(development ~ poly(temp,2) + (1 | storage), data = my.ds,
               family = ziGamma(link = "log"),
               ziformula = ~ 1)
summary(m_F)

# Create a bootstrap approach
nboot <- 1000
bres <- matrix(NA,nrow=nboot,
                  ncol=length(coef(m_F)),
                  dimnames=list(rep=seq(nboot),
                                coef=names(coef(m_F))))
set.seed(1000)
bootsize <- 100
for (i in seq(nboot)) {
  bdat <- my.ds[sample(nrow(my.ds),size=bootsize,replace=TRUE),]
  bfit <- update(m_F, data=bdat)  ## refit with new data
  bres[i,] <- coef(bfit)
}

Is there any way to do this?

Answer 1

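My answer is somewhat similar to @RuiBarradas's, but stays closer to your original code. The main point is that coef() doesn't do what you think it does. (1) By convention (originally set by the nlme package), coef() for a mixed model returns a matrix (or list of matrices) of group-level coefficients, while fixef() returns the fixed-effect (population-level) coefficients; (2) for glmmTMB, fixef() returns a list of fixed-effect vectors for the conditional, zero-inflation, and dispersion models (unlist() collapses this back to a vector with concatenated names).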
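The flattening step can be illustrated without fitting anything. The sketch below uses a mock list shaped like the fixef() result described above (one named vector per model component); the name fixef_mock and all numeric values are invented for illustration and do not come from a fitted model.

```r
# Mock stand-in for fixef(m_F): a list of named fixed-effect vectors,
# one per model component. All numbers here are invented.
fixef_mock <- list(
  cond = c("(Intercept)" = 3.15, "temp" = -0.004),
  zi   = c("(Intercept)" = -2.0)
)

# unlist() flattens the list into one named numeric vector; the component
# name is prepended to each coefficient name, separated by a dot.
flat <- unlist(fixef_mock)
names(flat)

# The length of the flattened vector is the number of columns that a
# bootstrap results matrix needs.
length(flat)
```

Sizing the results matrix from the flattened vector, rather than from length(coef(m_F)), keeps the number of columns equal to the number of values stored per replicate.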

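The other point to keep in mind is that bootstrapping at the level of individual observations may not be wise for a data set with grouping structure. You can bootstrap at the level of groups, at the level within groups, or both. You can bootstrap residuals if you have a linear model, but this doesn't work for GLMMs with count data. You can also use lme4::bootMer to do parametric bootstrapping, which is pretty much the only option when you have a GLMM with crossed random effects.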
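As a rough sketch of the group-level option, the helper below resamples whole groups with replacement. The function name resample_groups and the toy data frame are hypothetical; the toy data only mimics the column names of my.ds and is not the real data set.

```r
# Hypothetical helper: draw whole groups with replacement and stack their rows.
resample_groups <- function(data, group) {
  ids <- unique(data[[group]])
  drawn <- ids[sample.int(length(ids), replace = TRUE)]
  pieces <- lapply(seq_along(drawn), function(k) {
    piece <- data[data[[group]] == drawn[k], , drop = FALSE]
    piece[[group]] <- k  # relabel so a group drawn twice counts as two groups
    piece
  })
  do.call(rbind, pieces)
}

# Toy data standing in for my.ds (values are simulated, not real).
set.seed(1)
toy <- data.frame(
  storage     = rep(c(5, 10, 15, 20, 30), each = 4),
  temp        = rep(c(0, 10, 20, 30), times = 5),
  development = rpois(20, lambda = 20)
)

bdat <- resample_groups(toy, "storage")
table(bdat$storage)  # five relabelled groups of four rows each
```

Inside the bootstrap loop, a data set produced this way would be passed to update(m_F, data = bdat) in place of the row-level resample. With only a handful of storage levels, the group-level bootstrap is coarse, so this is a starting point rather than a recommendation.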

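PS what is bootsize doing here? The standard approach to bootstrapping is to resample, with replacement, a data set of the same size as the original. Resampling only a quarter of the data set (nrow(my.ds) == 400, bootsize == 100) is well-defined but very unusual. Are you deliberately doing some particular non-standard bootstrap...?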
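For comparison, here is a minimal sketch of the two index draws. The value n = 400 is taken from the row count quoted above; the names idx_full and idx_sub are illustrative.

```r
set.seed(1000)
n <- 400  # nrow(my.ds) as reported in the question

# Standard bootstrap: when size is omitted, sample() draws n indices.
idx_full <- sample(n, replace = TRUE)

# The question's variant: only a quarter as many draws.
idx_sub <- sample(n, size = 100, replace = TRUE)

c(full = length(idx_full), sub = length(idx_sub))
```

In the loop, my.ds[idx_full, ] would then replace the bootsize-based subset.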

sum_fun <- function(fit) {
    unlist(fixef(fit))
}

bres <- matrix(NA,
               nrow=nboot,
               ncol=length(sum_fun(m_F)),
               dimnames=list(rep=seq(nboot),
                             coef=names(sum_fun(m_F))))
set.seed(1000)
bootsize <- 100
## the progress bar must span all nboot replicates
## (nboot is defined in the question's code)
pb <- txtProgressBar(max = nboot, style = 3)
for (i in seq(nboot)) {
    setTxtProgressBar(pb, i)
    bdat <- my.ds[sample(nrow(my.ds), size=bootsize,replace=TRUE),]
    bfit <- update(m_F, data=bdat)  ## refit with new data
    bres[i,] <- sum_fun(bfit)
}
close(pb)
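Once the loop has filled bres, one common way to summarize it is with percentile intervals per column. The sketch below runs on a mock matrix (bres_mock, with invented values) so that it works without refitting the model; it is an illustration, not part of the original answer.

```r
set.seed(1000)

# Mock bootstrap results: 1000 replicates of two coefficients (made-up values).
bres_mock <- cbind(
  "cond.(Intercept)" = rnorm(1000, mean = 3.15, sd = 0.02),
  "zi.(Intercept)"   = rnorm(1000, mean = -2, sd = 0.3)
)

# 2.5% and 97.5% quantiles of each column; na.rm = TRUE skips any
# replicate whose row was left as NA.
ci <- t(apply(bres_mock, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE))
ci
```

With the real results, bres would be used in place of bres_mock.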

Answer 2

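To use the boot package, you must define a function that bootstraps the data and then computes a statistic, or a vector of statistics, from it. This is the function ziboot below. Then call boot, passing it the data, the function, and the number of replicates.

The function fits nearly the same model as the question's code (with a linear temp term in place of poly(temp,2)), but the model output has to be converted into a vector of coefficients. That is what the lapply call does.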
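The conversion step can be looked at in isolation. The sketch below assumes the group-level coefficients arrive as a data frame with one row per storage level and one column per coefficient; cf_mock and its values are invented for illustration.

```r
# Mock stand-in for the group-level coefficient table (invented values).
cf_mock <- data.frame(
  "(Intercept)" = c(3.16, 3.15, 3.14),
  temp          = c(-0.004, -0.004, -0.004),
  row.names     = c("5", "10", "15"),
  check.names   = FALSE
)

# Flatten column by column and name each value "<coefficient>_<group>".
flat <- setNames(
  unlist(cf_mock, use.names = FALSE),
  paste(rep(names(cf_mock), each = nrow(cf_mock)), row.names(cf_mock), sep = "_")
)
names(flat)
```

This produces the same naming scheme as the lapply-based version, in a slightly more compact form.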

library(glmmTMB)
library(boot)

my.ds <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/ds.desenvol.csv")

# Create a bootstrap approach
# This function will be called by boot() below
ziboot <- function(data, i) {
  # this bootstraps the data
  d <- data[i, ]
  model <- glmmTMB(development ~ temp + (1 | storage), data = d,
                 family = ziGamma(link = "log"),
                 ziformula = ~ 1)
  cf <- coef(model)$cond$storage
  l <- as.list(cf)
  unlist(lapply(seq_along(l), \(i){
    x <- l[[i]]
    nms <- paste(names(l)[i], row.names(cf), sep = "_")
    setNames(x, nms)
  }))
}

set.seed(1000)
bootsize <- 100
b <- boot(my.ds, ziboot, R = bootsize)
colnames(b$t) <- names(b$t0)
head(b$t)
#>      (Intercept)_5 (Intercept)_10 (Intercept)_15 (Intercept)_20 (Intercept)_30
#> [1,]      3.156717       3.153949       3.139001       3.147799       3.196308
#> [2,]      3.172563       3.157384       3.164663       3.143005       3.196966
#> [3,]      3.175124       3.154946       3.158715       3.129027       3.168753
#> [4,]      3.149817       3.143550       3.135256       3.141367       3.167679
#> [5,]      3.159183       3.179388       3.147193       3.148219       3.237395
#> [6,]      3.148815       3.168335       3.117576       3.126973       3.178377
#>            temp_5      temp_10      temp_15      temp_20      temp_30
#> [1,] -0.004089067 -0.004089067 -0.004089067 -0.004089067 -0.004089067
#> [2,] -0.004404738 -0.004404738 -0.004404738 -0.004404738 -0.004404738
#> [3,] -0.003153053 -0.003153053 -0.003153053 -0.003153053 -0.003153053
#> [4,] -0.003547863 -0.003547863 -0.003547863 -0.003547863 -0.003547863
#> [5,] -0.003989763 -0.003989763 -0.003989763 -0.003989763 -0.003989763
#> [6,] -0.003137722 -0.003137722 -0.003137722 -0.003137722 -0.003137722

Created on 2022-07-05 by the reprex package (v2.0.1)
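The contract that boot expects from the statistic function (data first, resampled row indices second) can be tried on a small, fast example. The sketch below substitutes lm on the built-in cars data for the glmmTMB fit purely to keep it quick; the function name coef_boot is hypothetical.

```r
library(boot)

# Statistic function: boot() supplies the data and a vector of row indices.
coef_boot <- function(data, i) {
  d <- data[i, ]                     # the bootstrap resample
  coef(lm(dist ~ speed, data = d))   # named vector of statistics
}

set.seed(1000)
b <- boot(cars, coef_boot, R = 200)
colnames(b$t) <- names(b$t0)         # label the replicate matrix
dim(b$t)

# Percentile interval for the second statistic (the slope).
boot.ci(b, type = "perc", index = 2)
```

The same pattern should carry over to ziboot, since it also returns a named numeric vector.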

Answer 1
4, accepted, 2022-07-05 20:16:08

Answer 2
3, 2022-07-05 20:10:32
