
Multi-level regression model on multiply imputed data set in R (Amelia, zelig, lme4)

I am trying to run a multi-level model on multiply imputed data (created with Amelia). The data come from a clustered sample with 24 groups and N = 150.

library("ZeligMultilevel")
ML.model.0 <- zelig(dv ~ 1 + tag(1 | group), model = "ls.mixed",
                    data = a.out$imputations)
summary(ML.model.0)

This code produces the following error:

Error in object[[1]]$result$call : 
$ operator not defined for this S4 class
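
This message is what R reports when `$` is applied to an S4 object that has no `$` method. Slots of S4 objects are read with `@` instead. The following minimal sketch reproduces the message in base R; the class `FitDemo` and its single slot are hypothetical stand-ins for a fitted-model object and are not part of Zelig or lme4.

```r
library(methods)

# Hypothetical S4 class standing in for a fitted mixed model (assumption for illustration).
setClass("FitDemo", slots = c(call = "call"))
obj <- new("FitDemo", call = quote(lmer(y ~ 1 + (1 | g))))

# `$` has no method for this class, so the access fails; capture the message.
msg <- tryCatch(obj$call, error = function(e) conditionMessage(e))
print(msg)

# Slot access with `@` works.
print(obj@call)
```

This suggests the summary method for the multiply imputed fit is using `$call` on an S4 result where `@call` is needed.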

If I run an OLS regression instead, it works:

model.0 <- zelig(dv~1, model="ls", data=a.out$imputations)
m.0 <- coef(summary(model.0)) 
print(m.0, digits = 2)

      Value Std. Error t-stat  p-value
[1,]    45       0.34    130 2.6e-285

Here is a working example:

require(Zelig)
require(Amelia)
require(ZeligMultilevel)

data(freetrade)
length(freetrade$country) #grouping variable

#Imputation of missing data

a.out <- amelia(freetrade, m=5, ts="year", cs="country")

# Models: (1) OLS; (2) multi-level 

model.0 <- zelig(polity~1, model="ls", data=a.out$imputations)
m.0 <- coef(summary(model.0)) 
print(m.0, digits = 2)

ML.model.0 <- zelig(polity~1 + tag(1|country), model="ls.mixed", data=a.out$imputations)
summary(ML.model.0)

I think the issue may be with how Zelig interfaces with Amelia's mi class. Therefore, I turned to an alternative R package: lme4.

require(lme4)
write.amelia(obj=a.out, file.stem="inmi", format="csv", na="NA")
diff <-list(5)  # a list to store each model, 5 is the number of the imputed datasets

for (i in 1:5) {
  file.name <- paste("inmi", 5, ".csv", sep = "")
  data.to.use <- read.csv(file.name)
  diff[[5]] <- lmer(polity ~ 1 + (1 | country),
                    data = data.to.use)
}
diff

The result is the following:

[[1]]
[1] 5

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
Linear mixed model fit by REML 
Formula: polity ~ 1 + (1 | country) 
   Data: data.to.use 
  AIC  BIC logLik deviance REMLdev
 1006 1015 -499.9     1002   999.9
Random effects:
 Groups   Name        Variance Std.Dev.
 country  (Intercept) 14.609   3.8222  
 Residual             17.839   4.2236  
Number of obs: 171, groups: country, 9

Fixed effects:
            Estimate Std. Error t value
(Intercept)    2.878      1.314    2.19

The results remain the same when I replace diff[[5]] with diff[[4]], diff[[3]], etc. Still, I am wondering whether these are actually the results for the combined data set or for a single imputed data set. Any thoughts? Thanks!
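
On the lme4 loop: as written, every iteration reads inmi5.csv and assigns to diff[[5]], and list(5) creates a one-element list holding the number 5 rather than five empty slots. That is consistent with the output above (5 in the first slot, NULL in slots 2 to 4, one model in slot 5), so the printed fit is for a single imputed data set (the fifth), not a combined result. A minimal sketch of a corrected version follows. It assumes the Amelia and lme4 packages are installed, and it fits directly on `a.out$imputations` instead of writing CSV files, which is a simplification and not what the original code does. The seed is arbitrary.

```r
library(Amelia)
library(lme4)

data(freetrade)
set.seed(1)  # arbitrary seed, only to make the imputations reproducible
a.out <- amelia(freetrade, m = 5, ts = "year", cs = "country", p2s = 0)

# Fit the same random-intercept model to each of the five imputed data sets.
fits <- lapply(a.out$imputations, function(d) {
  lmer(polity ~ 1 + (1 | country), data = d)
})

# Collect the intercept and its standard error from every fit.
est <- sapply(fits, function(f) unname(fixef(f)[1]))
se  <- sapply(fits, function(f) sqrt(as.matrix(vcov(f))[1, 1]))
print(cbind(estimate = est, std.error = se))
```

Each row is still a per-imputation result; a combined estimate requires pooling these rows, for example with Rubin's rules.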

I modified the summary function for this object (I fetched the package source and opened ./R/summary.R). I added some curly braces to make the code flow and changed one call from getcoef to coef. This should work for this particular case, but I'm not sure whether it is general. The function getcoef looks for a slot named coef3, which I have never seen. Perhaps @BenBolker can take a look? I can't guarantee this is what the result should look like, but the output looks legitimate to me. Perhaps you could contact the package authors so that this is corrected in a future version.

summary(ML.model.0)

  Model: ls.mixed
  Number of multiply imputed data sets: 5 

Combined results:

Call:
zelig(formula = polity ~ 1 + tag(1 | country), model = "ls.mixed", 
    data = a.out$imputations)

Coefficients:
        Value Std. Error   t-stat    p-value
[1,] 2.902863   1.311427 2.213515 0.02686218

For combined results from datasets i to j, use summary(x, subset = i:j).
For separate results, use print(summary(x), subset = i:j).

Modified function:

summary.MI <- function (object, subset = NULL, ...) {
  if (length(object) == 0) {
    stop('Invalid input for "subset"')
  } else {
    if (length(object) == 1) {
      return(summary(object[[1]]))
    }
  }

  # Roman: This function isn't fetching coefficients robustly. Something goes wrong. Contact package author.
  getcoef <- function(obj) {
    # S4
    if (!isS4(obj)) {
      coef(obj)
    } else {
      if ("coef3" %in% slotNames(obj)) {
        obj@coef3
      } else {
        obj@coef
      }
    }
  }

    #
    res <- list()

    # Get indices
    subset <- if (is.null(subset)) {
      1:length(object)
    } else {
      c(subset)
    }

    # Compute the summary of all objects
    for (k in subset) {
      res[[k]] <- summary(object[[k]])
    }


    # Answer
    ans <- list(
      zelig = object[[1]]$name,
      call = object[[1]]$result@call,
      all = res
    )

    #
    coef1 <- se1 <- NULL

    #
    for (k in subset) {
#       tmp <-  getcoef(res[[k]]) # Roman: I changed this to coef, not 100% sure if the output is the same
      tmp <- coef(res[[k]])
      coef1 <- cbind(coef1, tmp[, 1])
      se1 <- cbind(se1, tmp[, 2])
    }

    rows <- nrow(coef1)
    Q <- apply(coef1, 1, mean)
    U <- apply(se1^2, 1, mean)
    B <- apply((coef1-Q)^2, 1, sum)/(length(subset)-1)
    var <- U+(1+1/length(subset))*B
    nu <- (length(subset)-1)*(1+U/((1+1/length(subset))*B))^2

    coef.table <- matrix(NA, nrow = rows, ncol = 4)
    dimnames(coef.table) <- list(rownames(coef1),
                                 c("Value", "Std. Error", "t-stat", "p-value"))
    coef.table[,1] <- Q
    coef.table[,2] <- sqrt(var)
    coef.table[,3] <- Q/sqrt(var)
    coef.table[,4] <- pt(abs(Q/sqrt(var)), df=nu, lower.tail=FALSE)*2
    ans$coefficients <- coef.table
    ans$cov.scaled <- ans$cov.unscaled <- NULL

    for (i in 1:length(ans)) {
      if (is.numeric(ans[[i]]) && !names(ans)[i] %in% c("coefficients")) {
        tmp <- NULL
        for (j in subset) {
          r <- res[[j]]
          tmp <- cbind(tmp, r[[pmatch(names(ans)[i], names(res[[j]]))]])
        }
        ans[[i]] <- apply(tmp, 1, mean)
      }
    }

    class(ans) <- "summaryMI"
    ans
  }
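
The pooling step in the function above (the lines computing Q, U, B, var and nu) follows Rubin's rules for combining estimates across imputations. The same arithmetic can be sketched as a standalone base R function for a single coefficient. `pool_rubin` is a hypothetical helper name, not part of Zelig or Amelia, and the input numbers are made up purely for illustration.

```r
# pool_rubin() is a hypothetical helper (assumption), mirroring the Q/U/B/var/nu lines above.
pool_rubin <- function(est, se) {
  m <- length(est)
  stopifnot(m > 1, length(se) == m)
  Q <- mean(est)                  # pooled point estimate
  U <- mean(se^2)                 # within-imputation variance
  B <- var(est)                   # between-imputation variance (divides by m - 1)
  total <- U + (1 + 1 / m) * B    # total variance
  nu <- (m - 1) * (1 + U / ((1 + 1 / m) * B))^2  # degrees of freedom
  t_stat <- Q / sqrt(total)
  c(Value = Q,
    `Std. Error` = sqrt(total),
    `t-stat` = t_stat,
    `p-value` = 2 * pt(abs(t_stat), df = nu, lower.tail = FALSE))
}

# Made-up estimates and standard errors from five hypothetical imputations.
pooled <- pool_rubin(est = c(2.90, 2.80, 3.00, 2.95, 2.85),
                     se  = c(1.30, 1.31, 1.32, 1.30, 1.31))
print(round(pooled, 3))
```

The pooled standard error is never smaller than the square root of the average squared per-imputation standard error, because the between-imputation variance is added on top.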
