Different versions of R, lme4 and OS X give different fixed-effects significance results in glmer
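I am fitting a logit mixed-effects model with glmer() from the lme4 package. The experiment uses a within-subjects, within-items design, with subjects and items as crossed random effects.
My problem: different versions of R and lme4 (running on different versions of OS X) produce different standard-error estimates for the fixed effects, and therefore different significance results.
Here is a subset of my data (the data from the last two subjects):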
structure(list(SubjN = c(87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L,
87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L,
87L, 87L, 87L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L,
88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L,
88L), Items = structure(c(3L, 10L, 11L, 5L, 1L, 12L, 2L, 6L,
9L, 6L, 3L, 4L, 8L, 11L, 12L, 7L, 8L, 2L, 7L, 10L, 9L, 5L, 1L,
4L, 10L, 3L, 5L, 11L, 12L, 1L, 2L, 6L, 9L, 6L, 3L, 4L, 8L, 11L,
12L, 7L, 2L, 8L, 10L, 7L, 9L, 5L, 1L, 4L), .Label = c("a", "c",
"k", "f", "g", "i", "d", "l", "e", "j", "b", "h"), class = "factor"),
IV1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("N", "L", "P"
), class = "factor"), DV = c(0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1,
0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
IV1.h = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), contrasts = structure(c(-1,
0.5, 0.5, 0, -0.5, 0.5), .Dim = c(3L, 2L), .Dimnames = list(
c("N", "L", "P"), c("N_vs_L&P", "L_vs_P"))), .Label = c("N",
"L", "P"), class = "factor"), N_vs_LP = c(-1, -1, -1, -1,
-1, -1, -1, -1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, -1, -1, -1, -1, -1, -1,
-1, -1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5), L_vs_P = c(0, 0, 0, 0, 0,
0, 0, 0, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0, 0, 0,
0, 0, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, 0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5)), .Names = c("SubjN",
"Items", "IV1", "DV", "IV1.h", "N_vs_LP", "L_vs_P"), row.names = c("3099",
"3100", "3101", "3102", "3103", "3104", "3119", "3120", "3107",
"3108", "3109", "3110", "3097", "3098", "3105", "3106", "3115",
"3116", "3117", "3118", "3111", "3112", "3113", "3114", "3147",
"3148", "3149", "3150", "3151", "3152", "3167", "3168", "3155",
"3156", "3157", "3158", "3145", "3146", "3153", "3154", "3163",
"3164", "3165", "3166", "3159", "3160", "3161", "3162"), class = "data.frame")
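Each subject completed 24 trials across 3 conditions (factor IV1, levels: N, L, P). I recorded whether they produced the target linguistic structure (DV == 1) or not (DV == 0). The analysis includes only subjects who produced the target structure at least once. Even so, most of them produced it only rarely. Here is the proportion of DV == 1 responses per subject in each condition: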
library(plyr)
#dput(ddply(mydata, .(SubjN, IV1), summarise, l = length(DV), y = round(mean(DV),2)))
structure(list(SubjN = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L,
4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L,
9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L,
13L, 14L, 14L, 14L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 17L, 17L,
18L, 18L, 18L, 19L, 19L, 19L, 20L, 20L, 20L, 21L, 21L, 21L, 22L,
22L, 22L, 23L, 23L, 23L, 24L, 24L, 24L, 25L, 25L, 25L, 26L, 26L,
26L, 27L, 27L, 27L, 28L, 28L, 28L, 29L, 29L, 29L, 30L, 30L, 30L,
31L, 31L, 31L, 32L, 32L, 32L, 33L, 33L, 33L, 34L, 34L, 34L, 35L,
35L, 35L, 36L, 36L, 36L, 37L, 37L, 37L, 38L, 38L, 38L, 39L, 39L,
39L, 40L, 40L, 40L, 41L, 41L, 41L, 42L, 42L, 42L, 43L, 43L, 43L,
44L, 44L, 44L, 45L, 45L, 45L, 46L, 46L, 46L, 47L, 47L, 47L, 48L,
48L, 48L, 49L, 49L, 49L, 50L, 50L, 50L, 51L, 51L, 51L, 52L, 52L,
52L, 53L, 53L, 53L, 54L, 54L, 54L, 55L, 55L, 55L, 56L, 56L, 56L,
57L, 57L, 57L, 58L, 58L, 58L, 59L, 59L, 59L, 60L, 60L, 60L, 61L,
61L, 61L, 62L, 62L, 62L, 63L, 63L, 63L, 64L, 64L, 64L, 65L, 65L,
65L, 66L, 66L, 66L, 67L, 67L, 67L, 68L, 68L, 68L, 69L, 69L, 69L,
70L, 70L, 70L, 71L, 71L, 71L, 72L, 72L, 72L, 73L, 73L, 73L, 74L,
74L, 74L, 75L, 75L, 75L, 76L, 76L, 76L, 77L, 77L, 77L, 78L, 78L,
78L, 79L, 79L, 79L, 80L, 80L, 80L, 81L, 81L, 81L, 82L, 82L, 82L,
83L, 83L, 83L, 84L, 84L, 84L, 85L, 85L, 85L, 86L, 86L, 86L, 87L,
87L, 87L, 88L, 88L, 88L), IV1 = structure(c(1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L), .Label = c("N", "L", "P"), class = "factor"), l = c(8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 7L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 8L, 8L, 8L, 8L, 8L,
7L, 8L, 6L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 8L, 7L, 7L, 8L, 7L, 8L,
8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 6L, 8L, 4L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L,
8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L,
8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 7L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L), y = c(1, 0.88, 1, 0.5, 0.25, 0.62,
0, 0, 0.25, 0, 0.25, 0, 0.12, 0, 0, 0, 0.12, 0, 0, 0.12, 0.12,
0, 0, 0.12, 0.38, 0, 0.25, 0, 0.12, 0, 0.12, 0, 0.25, 0, 0, 0.12,
0.5, 0.25, 0.5, 0, 0, 0.12, 0, 0.25, 0.12, 0, 0, 0.12, 0, 0.12,
0, 0, 0.12, 0.12, 0.12, 0.62, 0, 0, 0.5, 0.25, 1, 0.88, 1, 0,
0, 0.12, 0, 0.12, 0.12, 0.12, 0.12, 0, 0.62, 0.62, 0.38, 0.5,
0.88, 0.12, 0.12, 0, 0, 0.12, 0.12, 0, 0, 0.12, 0, 0, 0.12, 0,
0, 0.12, 0, 0, 0.25, 0, 0, 0.14, 0, 0.5, 0.57, 0.29, 0, 0.12,
0, 0, 0.12, 0, 0.25, 0.5, 0.25, 0, 0.12, 0.12, 0.25, 0, 0.38,
0, 0, 0.12, 0, 0, 1, 0.25, 0.12, 0.25, 0, 0.12, 0.12, 0, 0, 0.12,
0, 0, 0.12, 0.12, 0, 0, 0.12, 0, 0.14, 0.14, 0.12, 0, 0.12, 0,
0, 0.12, 0.12, 0, 1, 0.88, 1, 0, 0.12, 0, 0.12, 0, 0, 0.12, 0,
0.12, 0, 0, 0.12, 0.12, 0.12, 0.12, 1, 1, 1, 0.12, 0, 0, 0.12,
0.38, 0, 0, 0.12, 0, 0, 0, 0.5, 0.5, 0, 0.25, 0, 0.12, 0.29,
0, 0, 0.38, 0, 0, 0.62, 0.5, 0, 0.12, 0, 0.12, 0.12, 0.25, 0.12,
0.25, 0.12, 0, 0.12, 0, 0, 0.12, 0, 0, 0.12, 0, 0.12, 0.12, 0,
0.12, 0.12, 0, 0, 0.12, 0.12, 0.12, 0, 0.38, 0.12, 0.57, 0, 0.12,
0, 0, 0.12, 0, 0, 0.12, 0, 0, 0.12, 0.14, 0.88, 0.88, 0.86, 0,
0, 0.14, 0, 0.12, 0.14, 0, 0.12, 0, 0, 0, 0.12, 0, 0, 0.12, 0.38,
0, 0, 0.5, 0.12, 0)), .Names = c("SubjN", "IV1", "l", "y"), row.names = c(NA,
-264L), class = "data.frame")
I fit the following model, with IV1 as a fixed effect using Helmert contrast coding; first contrast: N vs. L&P, second contrast: L vs. P.
m1 <- glmer(DV ~ IV1.h + (1 + IV1.h | SubjN) + (1 | Items) + (0 + N_vs_LP | Items) + (0 + L_vs_P | Items), data = mydata, family = binomial)
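The model does not allow correlations among the by-item random effects (I achieved this by creating separate slope terms for the two contrasts), because when the correlations were allowed they were estimated as perfect, which I took as a sign of overparameterization.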
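For readers reproducing this setup, here is a minimal sketch of how numeric contrast columns such as N_vs_LP and L_vs_P can be derived from a factor that carries the contrast matrix shown in the dput() output above. The toy data frame and the object name helm are assumptions for illustration, not part of the original data.

```r
# Toy factor with the same three levels as IV1 (illustrative data only)
toy <- data.frame(IV1 = factor(c("N", "L", "P", "N", "L", "P"),
                               levels = c("N", "L", "P")))

# Contrast matrix copied from the dput() of IV1.h:
# column 1 compares N with the mean of L and P, column 2 compares L with P
helm <- matrix(c(-1, 0.5, 0.5,
                  0, -0.5, 0.5),
               nrow = 3,
               dimnames = list(c("N", "L", "P"), c("N_vs_L&P", "L_vs_P")))

toy$IV1.h <- toy$IV1
contrasts(toy$IV1.h) <- helm

# The design matrix expands the factor into one numeric column per contrast
mm <- model.matrix(~ IV1.h, data = toy)

# Store the columns as plain numeric predictors so that each one can get
# its own random-slope term, e.g. (0 + N_vs_LP | Items)
toy$N_vs_LP <- unname(mm[, "IV1.hN_vs_L&P"])
toy$L_vs_P  <- unname(mm[, "IV1.hL_vs_P"])
toy
```

Writing each slope as its own term is what keeps the by-item slopes uncorrelated with each other and with the by-item intercept.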
1) Results using OS X 10.8.5 Mountain Lion, R version 3.0.2 (2013-09-25), lme4_1.0-5 (the original analysis I ran):
Generalized linear mixed model fit by maximum likelihood ['glmerMod']
 Family: binomial ( logit )
Formula: DV ~ IV1.h + (1 + N_vs_LP + L_vs_P | SubjN) + (1 | Items) + (0 + N_vs_LP | Items) + (0 + L_vs_P | Items)
   Data: mydata

      AIC       BIC    logLik  deviance
1492.5408 1560.2050 -734.2704 1468.5408

Random effects:
 Groups  Name        Variance  Std.Dev. Corr
 SubjN   (Intercept) 2.3885505 1.54549
         N_vs_LP     0.4394195 0.66289  -0.69
         L_vs_P      1.9287559 1.38880   0.04 0.08
 Items   (Intercept) 0.0531518 0.23055
 Items.1 N_vs_LP     0.0001950 0.01396
 Items.2 L_vs_P      0.0003619 0.01902
Number of obs: 2077, groups: SubjN, 88; Items, 12

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)    -2.2998     0.1964 -11.710  < 2e-16 ***
IV1.hN_vs_L&P   0.3704     0.1378   2.689  0.00717 **
IV1.hL_vs_P     0.2060     0.2320   0.888  0.37459
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
              (Intr) IV1.N_
IV1.hN_vs_L&P -0.388
IV1.hL_vs_P    0.014  0.019
2) Results using OS X 10.9.4 Mavericks, R version 3.1.1 (2014-07-10), lme4_1.1-7, optimizer 'bobyqa':
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial ( logit )
Formula: DV ~ IV1.h + (1 + N_vs_LP + L_vs_P | SubjN) + (1 | Items) + (0 +
    N_vs_LP | Items) + (0 + L_vs_P | Items)
   Data: mydata
Control: glmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid
  1492.5   1560.2   -734.3   1468.5     2065

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.4174 -0.3364 -0.2595 -0.1706  4.6028

Random effects:
 Groups  Name        Variance Std.Dev. Corr
 SubjN   (Intercept) 2.38791  1.5453
         N_vs_LP     0.43935  0.6628   -0.69
         L_vs_P      1.92629  1.3879    0.04 0.07
 Items   (Intercept) 0.05319  0.2306
 Items.1 N_vs_LP     0.00000  0.0000
 Items.2 L_vs_P      0.00000  0.0000
Number of obs: 2077, groups: SubjN, 88; Items, 12

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)    -2.2998     0.2095 -10.975   <2e-16 ***
IV1.hN_vs_L&P   0.3703     0.1892   1.958   0.0503 .
IV1.hL_vs_P     0.2063     0.2679   0.770   0.4413
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) IV1.N_
IV1.hN__L&P -0.379
IV1.hL_vs_P -0.001  0.003
I really don't know which result I should trust. Any help would be much appreciated.
P.S. Sorry if anything is unclear; this is my first post :)
Thanks a lot!
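From the lme4 NEWS file, under version 1.1-4:
Standard errors of fixed effects are now computed from the approximate Hessian by default (see the use.hessian argument in vcov.merMod); this gives better (correct) answers when the estimates of the random- and fixed-effect parameters are correlated (Github #47)
The problem is described in the GitHub issue referenced in that entry.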
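To see how a change in the standard error alone can move the first contrast across the 0.05 threshold, here is a small sketch that recomputes the Wald z statistics and two-sided p-values from the estimates and standard errors printed in the two summaries in the question. The variable names are illustrative only, and the inputs are the rounded values from the printed output, so the last digit can differ slightly from what summary() shows.

```r
# Estimate and standard error for the N vs. L&P contrast,
# copied from the two printed summaries
est_old <- 0.3704; se_old <- 0.1378   # lme4 1.0-5
est_new <- 0.3703; se_new <- 0.1892   # lme4 1.1-7

# Wald z statistic: estimate divided by its standard error
z_old <- est_old / se_old
z_new <- est_new / se_new

# Two-sided p-value from the standard normal distribution
p_old <- 2 * pnorm(-abs(z_old))
p_new <- 2 * pnorm(-abs(z_new))

round(c(z_old = z_old, p_old = p_old, z_new = z_new, p_new = p_new), 4)
```

The point estimates are essentially identical; only the denominator of the z statistic changed between versions.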
You should be able to retrieve the old standard errors from the newer (1.1-7) model via sqrt(diag(vcov(fitted_model,use.hessian=FALSE))), but the new version is more likely to be correct.
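A minimal sketch of that comparison follows. It assumes lme4 1.1-4 or later is installed, and it uses the cbpp example data that ships with lme4 as a stand-in, since the full data set from the question is not available here; the object names fit, se_hessian and se_old are illustrative.

```r
library(lme4)

# Stand-in binomial GLMM on the cbpp data bundled with lme4
# (not the model from the question)
fit <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

# Default since lme4 1.1-4: covariance matrix based on the approximate Hessian
se_hessian <- sqrt(diag(as.matrix(vcov(fit))))

# Older calculation, for comparison with results from earlier versions
se_old <- sqrt(diag(as.matrix(vcov(fit, use.hessian = FALSE))))

# Side-by-side view of the fixed-effect estimates and both sets of SEs
cbind(estimate = fixef(fit), se_hessian, se_old)
```

On a simple model like this one the two columns may be close; the NEWS entry says the difference matters when the random- and fixed-effect estimates are correlated.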
For more accurate confidence intervals / p-values, you can do likelihood ratio tests (using anova to compare nested models) and/or use confint(fitted_model,which="beta_") to compute profile confidence intervals.
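Both approaches can be sketched as follows, again on the cbpp data bundled with lme4 rather than the data from the question. The model names full and reduced are illustrative. One assumption to flag: this sketch passes parm = "beta_", which is how current lme4 documentation names the argument that restricts confint() to the fixed effects; check ?confint.merMod for your installed version.

```r
library(lme4)

# Full model and a nested model without the fixed effect of interest
full <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial)
reduced <- glmer(cbind(incidence, size - incidence) ~ 1 + (1 | herd),
                 data = cbpp, family = binomial)

# Likelihood ratio test comparing the two nested fits
lrt <- anova(reduced, full)
lrt

# Profile-likelihood confidence intervals for the fixed effects only
ci <- confint(full, parm = "beta_", method = "profile")
ci
```

Neither result relies on the Wald standard errors, so neither is affected by the choice of use.hessian.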