
How to extract AIC and log-likelihood from a pooled GLM?

I've imputed data using the mice package. Now I would like to present the results of a GLM based on the pooled data.

This is how I created the imputed data:

data.imputed <- mice(data, m=5, maxit = 50, method = 'pmm', seed = 500)

And this is what I used to fit and pool the model:

model.imputed1 <- with(data = data.imputed, expr = glm(dv ~ iv1 + iv2 + iv3, family=binomial))

model.imputed <- pool(model.imputed1)

However, when I run

AIC(model.imputed)

or

logLik(model.imputed)

for that matter, I receive the message

Error in UseMethod("logLik") : no applicable method for 'logLik' applied to an object of class "c('mipo', 'data.frame')"

This looks like it has something to do with the way mice stores its results. Is there a way to extract these two metrics (AIC and logLik) from this model? If not, how could I convert it into a model from which I can extract them?

Thanks!


Looking at the structure of the pool result, it seems that mice::pool does not store this information.

str(pool(model.imputed1))
# Classes ‘mipo’ and 'data.frame':   0 obs. of  3 variables:
#  $ call  : language pool(object = model.imputed1)
#  $ m     : int 40
#  $ pooled:'data.frame': 3 obs. of  9 variables:
#   ..$ estimate: num  0.0722 -0.2533 -0.8663
#   ..$ ubar    : num  0.000422 0.000318 0.029756
#   ..$ b       : num  2.53e-06 3.41e-05 3.95e-04
#   ..$ t       : num  0.000425 0.000353 0.030162
#   ..$ dfcom   : int  10060 10060 10060
#   ..$ df      : num  9902 2765 9487
#   ..$ riv     : num  0.00615 0.10989 0.01362
#   ..$ lambda  : num  0.00611 0.09901 0.01343
#   ..$ fmi     : num  0.00631 0.09966 0.01364

I am not sure whether Rubin's rules apply in the same way to statistics like the AIC and the log-likelihood (LL), but one thing you can do is compute the AIC and LL for each imputed dataset. Since you only have 5 datasets, this should not take long.

First, retrieve all the completed datasets in long format.

L_df <- mice::complete(data.imputed, "long", include = FALSE)

Then create some empty vectors and retrieve the number of imputations (m = 5 in your case).

AIC1 <- c()
logLik1 <- c()
m <- max(L_df$.imp)

Then fit the model to each dataset and store the AIC and LL in the empty vectors just created.

for (i in seq_len(m)) {
  # Fit the model on the i-th completed dataset only (subset with `== i`, not `== m`)
  fit_i <- glm(dv ~ iv1 + iv2 + iv3, family = binomial,
               data = L_df[L_df$.imp == i, ])
  AIC1[i] <- AIC(fit_i)
  logLik1[i] <- as.numeric(logLik(fit_i))
}
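The same per-imputation values can also be read without refitting. The object returned by with() is a mira object, and its analyses element holds the fitted model for each imputed dataset. A minimal sketch, where the helper name fit_stats_by_imputation is made up for illustration and is not part of mice:

```r
# Hypothetical helper (not part of mice): collect AIC and log-likelihood
# from each per-imputation fit stored in a mira-like object.
# `fits` is assumed to have an `analyses` element that is a list of fitted
# models, as the result of with(data.imputed, glm(...)) does.
fit_stats_by_imputation <- function(fits) {
  analyses <- fits$analyses
  data.frame(
    imputation = seq_along(analyses),
    AIC        = vapply(analyses, AIC, numeric(1)),
    logLik     = vapply(analyses,
                        function(fit) as.numeric(logLik(fit)),
                        numeric(1))
  )
}
```

With the objects from the question, fit_stats_by_imputation(model.imputed1) would be called on the with() result, before it is passed to pool(). It should return one row per imputation.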

The result of this loop should be 5 AIC values stored in AIC1 and 5 LL values stored in logLik1. You could use these values to report the average AIC and its variance across datasets, or report more robust measures such as the median and range (since you only have 5 values).
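The descriptive summaries mentioned above can be computed in one step. This is only a sketch: the helper name summarise_fit_stat is made up for illustration, and the result describes the spread across imputations. It is not a pooled estimate in the sense of Rubin's rules.

```r
# Hypothetical helper: descriptive summary of one fit statistic
# (e.g. AIC or log-likelihood) across the m imputed datasets.
summarise_fit_stat <- function(x) {
  c(mean     = mean(x),
    variance = var(x),      # between-imputation variance of the statistic
    median   = median(x),
    min      = min(x),
    max      = max(x))
}
```

For example, summarise_fit_stat(AIC1) and summarise_fit_stat(logLik1) would summarise the two vectors filled by the loop.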
