简体   繁体   中英

Error message in mgcv GAM when using family = gaulss()

I'm trying to use hierarchical general additive models in mgcv using the Gaussian location-scale model family. However, this family of functions throws a cryptic error when trying to fit.

Followed the guidance on HGAMs from a recent paper ( https://peerj.com/articles/6876/ ) along with the guidance from the notes in the lmvar package write-up ( https://cran.r-project.org/web/packages/lmvar/vignettes/Intro.html ).


library(mgcv); library(datasets)

# Using the CO2 dataset for illustration

data <- datasets::CO2

# Changing the 'Plant' factor from ordered to unordered

data$Plant <- factor(data$Plant, ordered = FALSE)

# This model fits with no errors

CO2_modG_1 <- gam(log(uptake) ~ s(log(conc), k = 5, bs = 'tp') + s(Plant, k = 12, bs = 're'), data = data, method = 'REML', family = 'gaussian')

# This model fails with error

CO2_modG_2 <- gam(log(uptake) ~ s(log(conc), k = 5, bs = 'tp') + s(Plant, k = 12, bs = 're'), data = data, method = 'REML', family = gaulss())

This returns the error message:


Error in while (sqrt(mean(dlb/(dlb + lami * dS * rm)) * mean(dlb)/mean(dlb +  : 
  missing value where TRUE/FALSE needed

You have to pass a list of two formula objects to gam() when fitting a Gaussian location-scale model with the gaulss() family.

You don't say how you want to model the scale component of the model, so here's an example which should be equivalent to the gaussian() family where we have a constant term for the scale and a the linear predictor you provided for the mean.

CO2_modG_2 <- gam(list(log(uptake) ~ s(log(conc), k = 5, bs = 'tp') + 
                         s(Plant, k = 12, bs = 're'), 
                       ~ 1), 
                  data = data, method = 'REML', family = gaulss())

If you want to allow each plant to have it's own variance, say, then we can add terms to the second formula:

CO2_modG_3 <- gam(list(log(uptake) ~ s(log(conc), k = 5, bs = 'tp') + 
                         s(Plant, k = 12, bs = 're'), 
                       ~ s(Plant, k = 12, bs = 're')), 
                  data = data, method = 'REML', family = gaulss())

Importantly, you have to provide a list of two formula objects to this family and the second formula should only have a tilde plus the right-hand-side of the formula specifying the linear predictor for the scale parameter.

list(response ~ x1 + s(x2), # location linear predictor, left & right hand sided
              ~ x1 + s(x3)  # scale linear predictor, right-hand side only
    )

hence, as per the first example I showed above, if you want a constant term only in the linear predictor for the scale you need to use R's notation for the intercept or constant term; ~ 1

list(response ~ x1 + s(x2), # location linear predictor, left & right hand sided
              ~ 1           # scale linear predictor, constant
    )

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM