
How to strip down the glm model?

The object returned by glm contains residuals, fitted.values, effects, qr$qr, linear.predictors, weights, and so on, which add up to a humongous object if the input is big enough.

How do I strip it down so that something like predict will still work?

Ideally, I want a function that returns a small function object equivalent to function(x) predict(my_model, data.frame(x = x)), something like as.stepfun for isoreg.

Most of the model components are descriptive and are not needed for predict to work. A helper function (hat tip: R-Bloggers) can be used to remove the fat:

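stripGlmLR <- function(cm) {
  # Copies of the training data stored on the fit
  cm$y <- NULL
  cm$model <- NULL
  cm$data <- NULL

  # Per-observation vectors; each is as long as the training set
  cm$residuals <- NULL
  cm$fitted.values <- NULL
  cm$effects <- NULL
  cm$linear.predictors <- NULL
  cm$weights <- NULL
  cm$prior.weights <- NULL

  # The n x p QR matrix; the rest of cm$qr (rank, pivot) is kept
  cm$qr$qr <- NULL

  # Family functions that plain prediction does not call
  # (linkinv is kept, so type = "response" still works)
  cm$family$variance <- NULL
  cm$family$dev.resids <- NULL
  cm$family$aic <- NULL
  cm$family$validmu <- NULL
  cm$family$simulate <- NULL

  # Drop environment references so the fitting environment is not saved with the model
  attr(cm$terms, ".Environment") <- NULL
  attr(cm$formula, ".Environment") <- NULL

  cm
}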

Now you can apply it to your model. In this example the object shrinks by more than four orders of magnitude (from about 492,000 Kb to 18.5 Kb):

traindata <- data.frame(x = rnorm(1e6), y = rnorm(1e6))
testdata <- data.frame(x = rnorm(10))

mod1 <- glm(y~x, data= traindata)
mod2 <- stripGlmLR(mod1)

format(object.size(mod1), units = "Kb")
# [1] "492234.5 Kb"
format(object.size(mod2), units = "Kb")
# [1] "18.5 Kb"

all(predict(object = mod1, newdata = testdata) == 
    predict(object = mod2, newdata = testdata))
# [1] TRUE
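
One caveat about the measurement above: object.size() does not follow environments, so it cannot show what the two attr(..., ".Environment") lines buy you. That part matters when the model is fitted inside a function and then saved. The following is only a sketch with made-up names (fit_in_function and big are hypothetical); it compares serialized sizes, which do include the captured environment:

```r
# Hypothetical example: a model fitted inside a function keeps a reference
# to that function's environment through its formula and terms.
fit_in_function <- function() {
  big <- rnorm(1e6)  # unrelated local object, about 8 MB
  d <- data.frame(x = rnorm(100), y = rnorm(100))
  glm(y ~ x, data = d)
}

m <- fit_in_function()

# Serialized size in bytes; this is roughly what saveRDS() would write
size_before <- length(serialize(m, NULL))

# Remove the environment references (the model frame carries one as well)
m$model <- NULL
attr(m$terms, ".Environment") <- NULL
attr(m$formula, ".Environment") <- NULL

size_after <- length(serialize(m, NULL))
c(before = size_before, after = size_after)
```

If the model was fitted at top level, the captured environment is the global environment, which is serialized as a reference only, so the difference there would be much smaller.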

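Note that the stripped object is only meant for plain prediction. If you want to be able to use the full suite of glm methods, you will need to retain other components of the model; for example, summary() and predict(..., se.fit = TRUE) need qr$qr.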
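
The question also asks for a small function object rather than a smaller model. A minimal sketch of that idea, assuming a model with an intercept and a single numeric predictor (as in y ~ x) and no NA coefficients: keep only the coefficients and the inverse link in a closure. make_predictor is a hypothetical name, not part of any package.

```r
# Hypothetical helper: turn a fitted glm into a small prediction closure.
# Assumes the formula has the form y ~ x (intercept plus one numeric predictor).
make_predictor <- function(model) {
  beta <- unname(coef(model))        # c(intercept, slope)
  linkinv <- family(model)$linkinv   # maps the linear predictor to the response scale
  rm(model)                          # keep the full fit out of the closure's environment
  function(x, type = c("link", "response")) {
    type <- match.arg(type)          # default "link", matching predict.glm
    eta <- beta[1] + beta[2] * x
    if (type == "response") linkinv(eta) else eta
  }
}

# Usage: compare against predict() on a small logistic regression
set.seed(1)
d <- data.frame(x = rnorm(1000))
d$y <- rbinom(1000, 1, plogis(0.5 + 2 * d$x))
fit <- glm(y ~ x, data = d, family = binomial)

f <- make_predictor(fit)
newx <- c(-1, 0, 1)
all.equal(f(newx, "response"),
          unname(predict(fit, data.frame(x = newx), type = "response")))
```

The rm(model) line is the design choice that matters here: without it the closure would still hold the whole fit in its enclosing environment, and saving the function would save the model with it. Models with factors, interactions or several predictors would need the terms object and model.matrix(), which this sketch deliberately leaves out.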
