
Model Averaging glmer problems

I am very new to this and have limited statistical experience, so please bear with me. I am trying to run model averaging on my data using glmer.

My data have three categorical explanatory variables, and I have successfully run dredge() on them and their interactions to get AICc values. However, when I run model.avg() I get output for some of the models but not for others. This is my input:

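library(lme4)   # glmer()
library(MuMIn)  # dredge(), get.models(), model.avg()

ae <- read.csv(file = file.choose())
options(na.action = "na.fail")  # required by dredge()

# Global model: all main effects and interactions of d, s and t
global.model <- glmer(
    cbind(numerator, total - numerator) ~ d + s + t + d:s:t + d:s + d:t + s:t + (1 | random),
    data = ae, family = binomial)
options(max.print = 1000000)
dredge(global.model, beta = c("none"), evaluate = TRUE, rank = "AICc")

# Same model again, this time keeping the dredge() result for averaging
ae.model <- glmer(
    cbind(numerator, total - numerator) ~ d + s + t + d:s:t + d:s + d:t + s:t + (1 | random),
    data = ae, family = binomial)
models <- dredge(ae.model)
summary(model.avg(get.models(models, subset = delta < 5)))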

An error message comes up:

Error in model.avg.default(get.models(models, subset = delta < 5)) : models are not unique. Duplicates: '2 = 3 = 4' and '10 = 11'

I really don't understand where I am going wrong and why I am getting an output for some interactions and not others.

Thanks in advance for any help given.

summary(ae)
  p                   t           day             hour            scan             random    behaviour  
 ae:182   blood        :42   Min.   :1.000   Min.   :1.000   Min.   : 0   ae_blood_1_1:  7   alert:182  
          egg          :35   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:10   ae_blood_1_2:  7              
          repellentfree:63   Median :2.000   Median :2.000   Median :30   ae_blood_1_3:  7              
          wolf         :42   Mean   :1.654   Mean   :1.962   Mean   :30   ae_blood_2_1:  7              
                             3rd Qu.:2.000   3rd Qu.:3.000   3rd Qu.:50   ae_blood_2_2:  7              
                             Max.   :3.000   Max.   :3.000   Max.   :60   ae_blood_2_3:  7              
                                                                          (Other)     :140              
   numerator           total      proportion        percentage      d                        s     
 Min.   : 0.0000   Min.   :17   Min.   :0.00000   Min.   : 0.000   E :14   1 - very light wind:21  
 1st Qu.: 0.0000   1st Qu.:17   1st Qu.:0.00000   1st Qu.: 0.000   SE:84   2 - light wind     :70  
 Median : 0.0000   Median :17   Median :0.00000   Median : 0.000   SW:35   3 - moderate wind  :77  
 Mean   : 0.5824   Mean   :17   Mean   :0.03426   Mean   : 3.426   W :49   4 - heavy wind     :14  
 3rd Qu.: 0.0000   3rd Qu.:17   3rd Qu.:0.00000   3rd Qu.: 0.000                                   
 Max.   :16.0000   Max.   :17   Max.   :0.94118   Max.   :94.118 

I'm guessing that the problem is that some of your interaction terms are redundant with each other, because of some combination of your experimental design (which combinations of factors are actually represented) and the way you've written your formula. You might have better luck expressing your model formula as cbind(numerator,total-numerator)~d*s*t+(1|random), which will make it easier for R to automatically exclude redundant terms.
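One way to check whether the design itself makes some terms redundant is to cross-tabulate the factors and compare the number of fixed-effect columns with the rank of the design matrix. The sketch below uses only base R and a small made-up data frame (toy, with hypothetical levels, not the real ae data); with the real data you would substitute data = ae.

```r
# Toy data (hypothetical): start from a full 4 x 2 x 2 design ...
toy <- expand.grid(d = c("E", "SE", "SW", "W"),
                   s = c("light", "moderate"),
                   t = c("blood", "egg"))
# ... then drop every row with d == "E" and s == "moderate",
# so that some factor combinations are never observed.
toy <- toy[!(toy$d == "E" & toy$s == "moderate"), ]

# Cross-tabulate the factors; zeros mark combinations with no data.
cells <- xtabs(~ d + s + t, data = toy)
n_empty <- sum(cells == 0)

# Build the fixed-effect design matrix for the full factorial formula
# and compare its column count with its numerical rank.
X <- model.matrix(~ d * s * t, data = toy)
n_cols <- ncol(X)
x_rank <- qr(X)$rank

print(cells)
cat("empty cells:", n_empty, " columns:", n_cols, " rank:", x_rank, "\n")
```

If the rank is smaller than the number of columns, some interaction coefficients cannot be estimated. That may be why several of the models dredge() fits end up identical, although this has not been confirmed against the original data.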

I'm having trouble reproducing this. (This isn't exactly an answer, but too long for a comment ...) When I sampled factor levels randomly to get the same number of replicates as in your data set, most of the interaction terms ended up being collinear so the model more or less collapsed. I constructed a factorial design (balanced, with 4x4x4x2 = 128 total observations) and added the rest of the necessary variables randomly:

set.seed(101)
ae <- expand.grid(d=c("E","SE","SW","W"),
                 s=c("very_light","light","moderate","heavy"),
                 t=c("blood","egg","rf","wolf"),
                 rep=1:2)
ae <- data.frame(ae,
      random=sample(LETTERS,size=nrow(ae),replace=TRUE),
      total=17,
      numerator=sample(c(0,16),prob=c(0.96,0.04),replace=TRUE,size=nrow(ae)))

(Note that the marginal distribution of your response variable is very skewed: the third quartile of numerator is zero, the max is 16 out of 17, and the mean is only about 0.58, which implies you have mostly zeroes with a few large values. A binomial model might not work very well.)
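A quick way to see this kind of skew is to look at the share of zeroes and the upper quantiles of the response. The sketch below is base R only and runs on a simulated vector drawn the same way as numerator in the example data above; for the real data you would use ae$numerator instead.

```r
# Simulated response (an assumption, mirroring the example data):
# mostly 0, occasionally 16 out of 17 trials.
set.seed(101)
numerator <- sample(c(0, 16), size = 128, replace = TRUE,
                    prob = c(0.96, 0.04))

# Share of observations that are exactly zero
prop_zero <- mean(numerator == 0)

# Upper quantiles and the full frequency table of the response
upper_q <- quantile(numerator, probs = c(0.5, 0.75, 0.9, 1))
counts <- table(numerator)

print(prop_zero)
print(upper_q)
print(counts)
```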

This slightly stripped-down version of your code produces lots of warnings (in part because there's no actual signal in the response variable), but no errors (I used subset=TRUE to model-average all of the models because there was only one model with delta-AIC<5 in this example set):

library(lme4)
library(MuMIn)
options(na.action="na.fail")
ae.model <- glmer(
  cbind(numerator,total-numerator)~d+s+t+d:s:t+d:s+d:t+s:t+(1|random),
  data=ae,family=binomial)
models <- dredge(ae.model,trace=TRUE)  
summary(model.avg(get.models(models,subset=TRUE)))


 