用于单向混合效应方差分析的Tukey测试中获得错误的p值

Question

I am trying to replicate a textbook example of a Tukey Test for One-Way Mixed Effect ANOVA (from Statistics, William L. Hays p 581-583) but the p-values I am getting using lme & glht don't make sense 我正在尝试复制单向混合效应方差分析的Tukey测试的教科书示例（摘自Statistics，William L. Hays p 581-583），但是我使用lme＆glht获得的p值没有意义

The study has repeated measures of four treatments and 10 subjects 该研究对4种疗法和10个受试者的重复测量

The Data 数据

subject=c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 
    6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10)

treatment=c("a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4", "a1", "a2", "a3", 
    "a4", "a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4", "a1", "a2", 
    "a3", "a4", "a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4", "a1", 
    "a2", "a3", "a4", "a1", "a2", "a3", "a4")

response=c(11, 9, 5, 17, 14, 12, 10, 18, 15, 7, 6, 21, 17, 10, 13, 22, 15, 7, 6, 
    15, 14, 8, 13, 22, 9, 6, 7, 15, 17, 11, 10, 19, 10, 13, 14, 23, 12, 
    8, 11, 20)

dataFrame=data.frame(subject, treatment, response)

The model 该模型

library(nlme)
model = lme(response~ treatment,random=~1|subject,data=dataFrame)
anova(model)

            numDF denDF  F-value p-value
(Intercept)     1    27 375.9198  <.0001
treatment       3    27  43.4507  <.0001

This F-value is close enough to Hay's ( F = 43.41 ) that I am pretty sure my model is fine. 这个F值足够接近Hay的（ F = 43.41 ），我可以肯定我的模型很好。

The Tukey Test 图基测试

library(multcomp)
glht.out =glht(model, mcp(treatment="Tukey"))
summary(glht.out)

         Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: lme.formula(fixed = response ~ treatment, data = dataFrame, random = ~1 | 
    subject)

Linear Hypotheses:
             Estimate Std. Error z value Pr(>|z|)    
a2 - a1 == 0   -4.300      1.006  -4.276   <0.001 ***
a3 - a1 == 0   -3.900      1.006  -3.879   <0.001 ***
a4 - a1 == 0    5.800      1.006   5.768   <0.001 ***
a3 - a2 == 0    0.400      1.006   0.398    0.979    
a4 - a2 == 0   10.100      1.006  10.044   <0.001 ***
a4 - a3 == 0    9.700      1.006   9.647   <0.001 ***

This is not consistent with the book. 这与书不符。 Hay's only reported some comparisons and gave HSD and mean, not p but for the a3-a1 contrast found there was no significant difference with an HSD = 4.02 and mean = 3.9 which by my calculation has p.value = 1-ptukey(3.9*sqrt(10/8.27),4,9)=0.05719563 . Hay仅报告了一些比较，并给出了HSD和均值，而不是p，但对于a3-a1对比发现， HSD = 4.02和mean = 3.9没有显着差异，根据我的计算p.value = 1-ptukey(3.9*sqrt(10/8.27),4,9)=0.05719563 。

The R output also doesn't make sense because the Tukey test p-value, which is supposed to control for multiple comparisons, is much much smaller than the p-value from a paired t-test (p=0.0142 using t.test(c(11, 14, 15, 17, 15, 14, 9, 17, 10, 12),c(5, 10, 6, 13, 6, 13, 7, 10, 14, 11), paired=TRUE) . R输出也没有意义，因为应该控制多重比较的Tukey测试p值远小于配对t检验的p值（使用t.test(c(11, 14, 15, 17, 15, 14, 9, 17, 10, 12),c(5, 10, 6, 13, 6, 13, 7, 10, 14, 11), paired=TRUE) p = t.test(c(11, 14, 15, 17, 15, 14, 9, 17, 10, 12),c(5, 10, 6, 13, 6, 13, 7, 10, 14, 11), paired=TRUE) 。

Any idea what I am doing wrong and how I can correctly perform a Tukey test in R? 知道我做错了什么以及如何在R中正确执行Tukey测试吗？

Answer 1

I investigated your question and these are my findings. 我调查了您的问题，这些是我的发现。 Unfortunatelly I don't have the Hay's book so I just focused on the info you posted. 不幸的是，我没有Hay的书，所以我只关注您发布的信息。

1) Regarding the HSD = 4.02 reported by Hay's book. 1）关于Hay的书报道的HSD = 4.02。 The lme function uses the RELM fitting method which maximises the restricted log-likelihood while the LM method maximises the log-likelihood, ?lme for more info. lme函数使用RELM拟合方法，该方法可最大化受限对数似然性，而LM方法可最大化对数似然性，即?lme以获得更多信息。 The model fit with the ML method shows the same F-value. 与ML方法拟合的模型显示相同的F值。

model <- lme(response~treatment, random=~1|subject, data=dataFrame, method="ML")
anova.lme(model)
            numDF denDF  F-value p-value
(Intercept)     1    27 375.9227  <.0001
treatment       3    27  43.4506  <.0001

When running the multiple comparisons with glht, the z value is closer to the one reported by the book 4.088, but the contrast still finds the difference significat. 当使用glht进行多重比较时， z value更接近于书中4.088所报告的值，但是对比仍然发现差异有意义。

glht.out <- glht(model, mcp(treatment="Tukey"), alternative="two.sided")
summary(glht.out)

     Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: lme.formula(fixed = response ~ treatment, data = dataFrame, random = ~1 | 
    subject, method = "ML")

Linear Hypotheses:
             Estimate Std. Error z value Pr(>|z|)    
a2 - a1 == 0  -4.3000     0.9539  -4.508  < 1e-04 ***
a3 - a1 == 0  -3.9000     0.9539  -4.088  0.00027 ***
a4 - a1 == 0   5.8000     0.9539   6.080  < 1e-04 ***
a3 - a2 == 0   0.4000     0.9539   0.419  0.97520    
a4 - a2 == 0  10.1000     0.9539  10.588  < 1e-04 ***
a4 - a3 == 0   9.7000     0.9539  10.168  < 1e-04 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

2) Regarding the calculation of the p-value. 2）关于p值的计算。 The previous output shows that the p-values are ajusted using the single-step method . 先前的输出显示，使用single-step method调整了p值。 The ?summary.glht description explains that any methods implemented in adjusted and p.adjust can be used. 本?summary.glht说明解释，在实施任何方法adjusted和p.adjust可以使用。 Here is an example of how to change it: 这是一个如何更改它的示例：

summary(glht.out, test = adjusted(type = "single-step"))
summary(glht.out, test = adjusted(type = "none"))

Unfortunatelly, I couldn't reproduce the p-values from the glht output. 不幸的是，我无法从glht输出中重现p值。 So I checked out the TukeyHSD function instead. 因此，我改为检出TukeyHSD函数。 It requires the aov model output. 它需要aov模型输出。

aovmodel <- aov(model)
summary(aovmodel)
            Df Sum Sq Mean Sq F value   Pr(>F)    
treatment    3  659.0  219.67   26.95 2.57e-09 ***
Residuals   36  293.4    8.15                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

When running the TukeyHSD we get the following output 运行TukeyHSD时，我们得到以下输出

TukeyHSD(aovmodel, "treatment")
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = model)

$treatment
      diff       lwr        upr     p adj
a2-a1 -4.3 -7.738482 -0.8615177 0.0093920
a3-a1 -3.9 -7.338482 -0.4615177 0.0210165
a4-a1  5.8  2.361518  9.2384823 0.0003378
a3-a2  0.4 -3.038482  3.8384823 0.9891641
a4-a2 10.1  6.661518 13.5384823 0.0000000
a4-a3  9.7  6.261518 13.1384823 0.0000000

The code is easier to understand than the one from multcomp, at least for me. 至少对于我来说，该代码比multcomp的代码更易于理解。 The p-values can be reproduced as follow: p值可以复制如下：

MSE <- 8.15
center <- -3.9/sqrt(MSE/10)  # -4.32002
ptukey(abs(center), 4, 36, lower.tail=FALSE)
[1] 0.02101645

Note that the HSD is now -4.32002 and the degree of freedom is Nk = 40 - 4 = 36. The p-value is 0.021 which is a slightly higher than one from the paired t-test. 请注意，HSD现在为-4.32002，自由度为Nk = 40-4 =36。p值为0.021，略高于配对t检验的p值。 Here is a Gist to reproduce the output. 这是重现输出的要点。

Hope this helps 希望这可以帮助

用于单向混合效应方差分析的Tukey测试中获得错误的p值

问题描述

1 个解决方案

解决方案1
0 2018-02-21 10:51:20

用于单向混合效应方差分析的Tukey测试中获得错误的p值

问题描述

1 个解决方案

解决方案1 0 2018-02-21 10:51:20

解决方案1
0 2018-02-21 10:51:20