How can I do model selection by AIC with a Gamma GLM in R?

As the documentation for glm() explains, the aic component of the value returned by glm() is not a valid AIC for a Gamma family:

For gaussian, Gamma and inverse gaussian families the dispersion is estimated from the residual deviance, and the number of parameters is the number of coefficients plus one. For a gaussian family the MLE of the dispersion is used so this is a valid value of AIC, but for Gamma and inverse gaussian families it is not.

Thus a valid AIC needs to be obtained in some other way.
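
To make the difference concrete, here is a minimal sketch on simulated data. The data, the helper neg2ll(), and the variable names are illustrative assumptions and not part of the original answer. It reproduces fit$aic by hand using the deviance-based dispersion (deviance / n, with unit prior weights assumed) and compares it with the value obtained at the MLE of the dispersion from MASS::gamma.dispersion().

```r
library(MASS)  # provides gamma.dispersion()

# Simulated data (illustrative assumption): Gamma response with a log-linear mean
set.seed(1)
n <- 200
x <- runif(n)
mu_true <- exp(1 + 2 * x)
y <- rgamma(n, shape = 2, scale = mu_true / 2)

fit <- glm(y ~ x, family = Gamma(link = "log"))
mu <- fitted(fit)

# Hypothetical helper: -2 * Gamma log-likelihood at a given dispersion
neg2ll <- function(disp) {
  -2 * sum(dgamma(y, shape = 1 / disp, scale = mu * disp, log = TRUE))
}

disp_dev <- deviance(fit) / n       # deviance-based estimate used for fit$aic
disp_mle <- gamma.dispersion(fit)   # maximum likelihood estimate

# Same penalty in both cases (coefficients plus one for the dispersion),
# so the two values differ only through the dispersion estimate.
k <- fit$rank + 1
aic_dev <- neg2ll(disp_dev) + 2 * k
aic_mle <- neg2ll(disp_mle) + 2 * k

print(c(glm = fit$aic, by_hand = aic_dev, mle = aic_mle))
```

The first two printed values should agree, while the third is evaluated at the dispersion that maximizes the likelihood for the fitted means and so should not exceed them.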

If you want to use the step() or MASS::stepAIC() model selection functions, you could first ensure that the AIC is calculated properly by doing something like this:

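# AIC for a Gamma GLM, evaluated at the MLE of the dispersion
# (glm() itself plugs in a deviance-based estimate instead).
GammaAIC <- function(fit) {
  disp <- MASS::gamma.dispersion(fit)  # MLE of the dispersion (1 / shape)
  mu <- fit$fitted.values
  p <- fit$rank                        # number of estimated coefficients
  y <- fit$y
  -2 * sum(dgamma(y, shape = 1 / disp, scale = mu * disp, log = TRUE)) + 2 * p
}

# Small-sample corrected version (AICc) of GammaAIC()
GammaAICc <- function(fit) {
  val <- logLik(fit)
  p <- attr(val, "df")    # parameter count reported by logLik()
  n <- attr(val, "nobs")  # number of observations
  GammaAIC(fit) + 2 * p * (p + 1) / (n - p - 1)
}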

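# Replacement for stats:::extractAIC.glm that uses GammaAIC() for Gamma fits
my_extractAIC <- function(fit, scale = 0, k = 2, ...) {
  n <- length(fit$residuals)
  edf <- n - fit$df.residual   # equivalent degrees of freedom
  if (fit$family$family == "Gamma") {
    aic <- GammaAIC(fit)
  } else {
    aic <- fit$aic             # other families: keep glm()'s own value
  }
  c(edf, aic + (k - 2) * edf)  # rescale the penalty from 2 to k per parameter
}
# Override the method inside the stats namespace so that step() and
# MASS::stepAIC() pick it up; this lasts for the current R session.
assignInNamespace("extractAIC.glm", my_extractAIC, ns = "stats")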
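
As a sanity check that does not depend on patching the stats namespace, a handful of candidate models can also be compared directly by their MLE-based AIC. The following is only a sketch under stated assumptions: the simulated data, the model list, and the helper gamma_aic_mle() (which mirrors GammaAIC() above) are all illustrative.

```r
# Simulated data (illustrative assumption): y depends on x1 only
set.seed(42)
n <- 150
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- rgamma(n, shape = 3, scale = exp(0.5 + 1.5 * dat$x1) / 3)

# Hypothetical helper mirroring GammaAIC(): AIC at the MLE of the dispersion
gamma_aic_mle <- function(fit) {
  disp <- MASS::gamma.dispersion(fit)
  mu <- fit$fitted.values
  -2 * sum(dgamma(fit$y, shape = 1 / disp, scale = mu * disp, log = TRUE)) +
    2 * fit$rank
}

# Candidate models to compare
fam <- Gamma(link = "log")
candidates <- list(
  null = glm(y ~ 1, family = fam, data = dat),
  x1   = glm(y ~ x1, family = fam, data = dat),
  both = glm(y ~ x1 + x2, family = fam, data = dat)
)

# One AIC value per candidate; smaller is better
aics <- vapply(candidates, gamma_aic_mle, numeric(1))
print(sort(aics))
```

A manual comparison like this is only practical for a small set of candidates; for larger searches the automated functions discussed here are more convenient.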

If you use the glmulti package, you can simply specify the use of the above GammaAIC() or GammaAICc() functions with the crit parameter of glmulti().
