
Getting standardized coefficients for a glmer model?

I've been asked to provide standardized coefficients for a glmer model, but am not sure how to obtain them. Unfortunately, the beta function does not work on glmer models:

Error in UseMethod("beta") : 
  no applicable method for 'beta' applied to an object of class "c('glmerMod', 'merMod')"

Are there other functions I could use, or would I have to write one myself?

Another problem is that the model contains several continuous predictors (which operate on similar scales) and 2 categorical predictors (one with 4 levels, one with six levels). The purpose of using the standardized coefficients would be to compare the impact of the categorical predictors to those of the continuous ones, and I'm not sure that standardized coefficients are the appropriate way to do so. Are standardized coefficients an acceptable approach?

The model is as follows:

model <- glmer(cbind(nr_corr, maximum - nr_corr) ~ (condition | SUBJECT) +
                 categorical_1 + categorical_2 +
                 continuous_1 + continuous_2 + continuous_3 + continuous_4 +
                 categorical_1:categorical_2 + categorical_1:continuous_3,
               data,
               control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 100000)),
               family = binomial)

reghelper::beta simply standardizes the numeric variables in the dataset. So, assuming your categorical variables are factors rather than numeric dummy variables or other contrast encodings, we can fairly simply standardize the numeric variables ourselves and refit the model:

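# Names (not positions) of the continuous predictors in the model formula
vars <- grep('^continuous', all.vars(formula(model)), value = TRUE)
# Center and scale one column; as.numeric() drops the matrix attributes from scale()
f <- function(var, data)
   as.numeric(scale(data[[var]]))
data[, vars] <- lapply(vars, f, data = data)
# Refit the model on the standardized data
update(model, data = data)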
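
As a quick sanity check of what standardizing does to the coefficients, here is a minimal base-R sketch on the built-in mtcars data (an illustrative stand-in, not the data from the question). In a model without interactions, the slope of a z-scored predictor should equal the raw slope multiplied by that predictor's standard deviation, while the fitted values stay the same.

```r
# Illustrative data only: mtcars ships with base R
d <- mtcars

# Raw fit: mpg as a function of weight and horsepower
fit_raw <- lm(mpg ~ wt + hp, data = d)

# Standardize the numeric predictors (mean 0, sd 1); as.numeric() drops
# the matrix attributes that scale() attaches
d_std <- d
d_std$wt <- as.numeric(scale(d$wt))
d_std$hp <- as.numeric(scale(d$hp))
fit_std <- update(fit_raw, data = d_std)

# Each standardized slope is the raw slope times the predictor's sd
raw_times_sd <- coef(fit_raw)[c("wt", "hp")] * c(sd(d$wt), sd(d$hp))
print(cbind(standardized = coef(fit_std)[c("wt", "hp")], raw_times_sd))
```

The two columns printed at the end should agree; only the intercept and the units of the slopes change, not the fit itself.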

For the more general case we can just as easily write our own beta.merMod function. However, we need to take into account whether it makes sense to standardize y. For example, in a Poisson model only non-negative integer values make sense for the response. A further question is whether to scale the variables used in the random effects, and whether it even makes sense to ask that question in the first place. In the function below I assume that categorical variables are encoded as character or factor, not as numeric or integer.

beta.merMod <- function(model, 
                        x = TRUE, 
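                        # NOTE: family(model) is a family object (a list), so this
                        # default is a logical vector, isTRUE(y) below is FALSE, and
                        # the response is only scaled when y = TRUE is passed explicitly
                        y = !family(model) %in% c('binomial', 'poisson'), 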
                        ran_eff = FALSE, 
                        skip = NULL, 
                        ...){
  # Extract all names from the model formula
  vars <- all.vars(form <- formula(model))
  lhs <- all.vars(form[[2]])
  # Get the names of the random-effects grouping factors
  ranef <- names(ranef(model))
  # Remove ranef and lhs from vars
  rhs <- vars[!vars %in% c(lhs, ranef)]
  # extract the data used for the model
  env <- environment(form)
  call <- getCall(model)
  data <- get(dname <- as.character(call$data), envir = env)
  # standardize the dataset
  vars <- character()
  if(isTRUE(x))
    vars <- c(vars, rhs)
  if(isTRUE(y))
    vars <- c(vars, lhs)
  if(isTRUE(ran_eff))
    vars <- c(vars, ranef)
  data[, vars] <- lapply(vars, function(var){
    if(is.numeric(data[[var]]))
      data[[var]] <- scale(data[[var]])
    data[[var]]
  })
  # Update the model and change the data into the new data.
  update(model, data = data)
}
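
The default for y above hinges on the model family. family() returns a family object, which is a list, so the family name is best read from its `family` element to get a single TRUE/FALSE. A minimal sketch with base-R fits follows; the esoph and cars datasets are only stand-ins, the helper name `should_scale_response` is hypothetical, and family() on a merMod object returns the same kind of object.

```r
# Hypothetical helper, for illustration only
should_scale_response <- function(model) {
  fam <- family(model)$family          # e.g. "gaussian", "binomial", "poisson"
  !fam %in% c("binomial", "poisson")   # a single TRUE or FALSE
}

# Binomial GLM on a built-in dataset: the response should stay unscaled
fit_bin <- glm(cbind(ncases, ncontrols) ~ agegp, data = esoph, family = binomial)
# Gaussian linear model: scaling the response is meaningful
fit_lm <- lm(dist ~ speed, data = cars)

print(should_scale_response(fit_bin))
print(should_scale_response(fit_lm))
```

Whether to adopt such a check as the default is a design choice: it would standardize the response of Gaussian models automatically, which changes the intercept and all slopes relative to scaling the predictors alone.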

The function works for both linear and generalized linear mixed-effects models (it has not been tested on nonlinear models), and it is used just like the other beta methods from reghelper:

library(reghelper)
library(lme4)
# Linear mixed effect model
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
fm2 <- beta(fm1)
fixef(fm1) - fixef(fm2)
(Intercept)        Days 
  -47.10279   -19.68157 

# Generalized mixed effect model
data(cbpp)
# create numeric variable correlated with period
# (no seed is set, so the values printed below will differ from run to run)
cbpp$nv <- 
  rnorm(nrow(cbpp), mean = as.numeric(levels(cbpp$period))[as.numeric(cbpp$period)])
gm1 <- glmer(cbind(incidence, size - incidence) ~ nv + (1 | herd),
              family = binomial, data = cbpp)
gm2 <- beta(gm1)
fixef(gm1) - fixef(gm2)
(Intercept)          nv 
  0.5946322   0.1401114

Note, however, that unlike reghelper's beta, this function returns the updated model, not a summary of the model.
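
Because a fitted model comes back, the usual lme4 accessors can be applied to it. A minimal sketch, assuming lme4 is installed and standardizing `Days` by hand so that the block does not depend on the function above:

```r
library(lme4)

# Standardize the predictor by hand (the response is left on its raw scale)
ss_std <- sleepstudy
ss_std$Days <- as.numeric(scale(sleepstudy$Days))

fm_raw <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
fm_std <- update(fm_raw, data = ss_std)

# The refit is an ordinary merMod object, so summary(), fixef(), etc. work
print(coef(summary(fm_std)))

# The standardized slope should match raw slope * sd(Days),
# up to optimizer tolerance
slope_check <- c(standardized = unname(fixef(fm_std)["Days"]),
                 raw_times_sd = unname(fixef(fm_raw)["Days"]) * sd(sleepstudy$Days))
print(slope_check)
```

The same pattern (`summary(...)` or `coef(summary(...))` on the returned object) recovers the coefficient table that reghelper's beta would otherwise print.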

Another problem is that the model contains several continuous predictors (which operate on similar scales) and 2 categorical predictors (one with 4 levels, one with six levels). The purpose of using the standardized coefficients would be to compare the impact of the categorical predictors to those of the continuous ones, and I'm not sure that standardized coefficients are the appropriate way to do so. Are standardized coefficients an acceptable approach?

Now that is a great question, but it is one better suited for stats.stackexchange, and not one I'm certain of the answer to.

Again, thank you so much, Oliver! For anybody who is interested in the answer to the last part of my question (whether standardized coefficients are an acceptable way to compare the impact of the categorical predictors with that of the continuous ones), you can find the answer here. The tl;dr is that using standardized regression coefficients is not the best approach for mixed models anyway, let alone one such as mine...
