How can a corrected pairwise comparison yield a smaller p-value than a single t.test?

Question

Hi, suppose I have these results:

df <- structure(list(len = c(4.2, 11.5, 7.3, 5.8, 6.4, 10, 11.2, 11.2, 
5.2, 7, 15.2, 21.5, 17.6, 9.7, 14.5, 10, 8.2, 9.4, 16.5, 9.7, 
16.5, 16.5, 15.2, 17.3, 22.5, 17.3, 13.6, 14.5, 18.8, 15.5, 19.7, 
23.3, 23.6, 26.4, 20, 25.2, 25.8, 21.2, 14.5, 27.3, 23.6, 18.5, 
33.9, 25.5, 26.4, 32.5, 26.7, 21.5, 23.3, 29.5, 25.5, 26.4, 22.4, 
24.5, 24.8, 30.9, 26.4, 27.3, 29.4, 23), supp = structure(c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("OJ", 
"VC"), class = "factor"), dose = structure(c(1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("D0.5", "D1", "D2"
), class = "factor")), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 
40L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 41L, 42L, 
43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 21L, 22L, 23L, 24L, 25L, 
26L, 27L, 28L, 29L, 30L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 
59L, 60L), class = "data.frame") 

df$int <- interaction(df$supp, df$dose)
e <- pairwise.t.test(df$len, df$int, p.adjust.method="BH")

So from the output:

        OJ.D0.5          VC.D0.5            OJ.D1     VC.D1            OJ.D2  
VC.D0.5 0.00285          -                  -         -                -      
OJ.D1   0.00000079391014 0.00000000000984   -         -                -      
VC.D1   0.04207          0.00000243821908 **0.00088** -                -      
OJ.D2   0.00000000042891 0.00000000000001   0.04645   0.00000089414918 -      
VC.D2   0.00000000042891 0.00000000000001   0.04474   0.00000085310153 0.96089

the comparison of VC.D1 vs OJ.D1 gives p = 0.00088.

However, a single t.test

t.test(df[df$supp == "VC" & df$dose == "D1", ]$len, 
       df[df$supp == "OJ" & df$dose == "D1", ]$len)

yields p-value = 0.001038.

So I must have messed up somewhere, because shouldn't an adjusted p-value be greater than a single uncorrected p-value?

Answer 1

Solution

You'll get the same results when you set p.adjust.method = "none" and pool.sd = FALSE:

pairwise.t.test(df$len, df$int, p.adjust.method = "none", pool.sd = FALSE)$p.value[3,3]
# 0.001038376

t.test(df[df$supp == "VC" & df$dose == "D1", ]$len, 
       df[df$supp == "OJ" & df$dose == "D1", ]$len)$p.value
# 0.001038376

Notes

Just a reminder to always read the documentation carefully and run some sanity checks, to make sure the function does what you think it does.
This only illustrates where the difference comes from. Which settings to use in your case will depend on your familiarity with the data.

Explanation

The comparison becomes much easier when we simply don't apply a multiple testing correction. In that case, both approaches should give the same p-value, right? So let's compare using p.adjust.method = "none". Running pairwise.t.test now gives 0.00059, which is still not equal to the 0.001038 from t.test.

The problem stems from the pool.sd argument. When it is TRUE (the default for unpaired tests), a common standard deviation is used across all comparisons. This is useful in general (if the assumption is met), but it does lead to different p-values.

When we look at the underlying code, this becomes clear:

if (pool.sd) {
        METHOD <- "t tests with pooled SD"
        xbar <- tapply(x, g, mean, na.rm = TRUE)
        s <- tapply(x, g, sd, na.rm = TRUE)
        n <- tapply(!is.na(x), g, sum)
        degf <- n - 1
        total.degf <- sum(degf)
        pooled.sd <- sqrt(sum(s^2 * degf)/total.degf)
        compare.levels <- function(i, j) {
            dif <- xbar[i] - xbar[j]
            se.dif <- pooled.sd * sqrt(1/n[i] + 1/n[j])
            t.val <- dif/se.dif
            if (alternative == "two.sided") 
                2 * pt(-abs(t.val), total.degf)
            else pt(t.val, total.degf, lower.tail = (alternative == 
                "less"))
        }
    }

Among other things, the degrees of freedom are summed across all groups (total.degf) and used to calculate a pooled standard deviation (pooled.sd). Both then enter every pairwise comparison, so each test borrows information from all six groups rather than only the two being compared.
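To make the pooled branch concrete, here is a minimal sketch that redoes its arithmetic by hand for the VC.D1 vs OJ.D1 comparison. It assumes the question's df is R's built-in ToothGrowth data with dose relabelled (the values appear to match), so the data are rebuilt from that dataset; p.manual and p.builtin are names introduced here for illustration only.

```r
# Assumption: the question's df is the built-in ToothGrowth data, relabelled
df <- ToothGrowth
df$dose <- factor(df$dose, labels = c("D0.5", "D1", "D2"))
df$int <- interaction(df$supp, df$dose)

# Per-group summaries, mirroring the pooled branch of pairwise.t.test
groups <- split(df$len, df$int)
xbar <- sapply(groups, mean)
s    <- sapply(groups, sd)
n    <- lengths(groups)
degf <- n - 1
total.degf <- sum(degf)                            # 6 groups of 10 observations
pooled.sd  <- sqrt(sum(s^2 * degf) / total.degf)   # one SD shared by all comparisons

# Two-sided p-value for VC.D1 vs OJ.D1 using the pooled SD and the total df
se.dif <- pooled.sd * sqrt(1 / n[["VC.D1"]] + 1 / n[["OJ.D1"]])
t.val  <- (xbar[["VC.D1"]] - xbar[["OJ.D1"]]) / se.dif
p.manual <- 2 * pt(-abs(t.val), total.degf)

# The same comparison from pairwise.t.test, unadjusted, default pooling
p.builtin <- pairwise.t.test(df$len, df$int,
                             p.adjust.method = "none")$p.value["VC.D1", "OJ.D1"]
all.equal(p.manual, p.builtin)
```

If the assumption about the data holds, both values should be the 0.00059 mentioned above. Indexing the p-value matrix by level names rather than by position avoids depending on the order of the factor levels.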

When we set pool.sd = FALSE, the code simply calls the t.test function:

    else {
        METHOD <- if (paired) 
            "paired t tests"
        else "t tests with non-pooled SD"
        compare.levels <- function(i, j) {
            xi <- x[as.integer(g) == i]
            xj <- x[as.integer(g) == j]
            t.test(xi, xj, paired = paired, alternative = alternative, 
                ...)$p.value
        }
    }
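Once both sides run the same test, the intuition from the question does hold. The sketch below, again assuming df is the built-in ToothGrowth data relabelled, compares unadjusted and BH-adjusted p-values with pool.sd = FALSE and then redoes the adjustment with p.adjust over the lower triangle of the matrix; raw, adj and manual are illustrative names.

```r
# Assumption: the question's df is the built-in ToothGrowth data, relabelled
df <- ToothGrowth
df$dose <- factor(df$dose, labels = c("D0.5", "D1", "D2"))
df$int <- interaction(df$supp, df$dose)

# Same test on both sides (per-pair t.test via pool.sd = FALSE); only the adjustment differs
raw <- pairwise.t.test(df$len, df$int, p.adjust.method = "none",
                       pool.sd = FALSE)$p.value
adj <- pairwise.t.test(df$len, df$int, p.adjust.method = "BH",
                       pool.sd = FALSE)$p.value

# Adjusted p-values are never smaller than the raw ones
# (NA entries are the unused upper triangle of the matrix)
all(adj >= raw, na.rm = TRUE)

# The same adjustment by hand: apply p.adjust to the 15 pairwise p-values
manual <- raw
idx <- lower.tri(manual, diag = TRUE)
manual[idx] <- p.adjust(raw[idx], method = "BH")
all.equal(manual, adj)
```

So the adjusted value in the question was smaller only because it came from a different test (pooled SD, total degrees of freedom), not because the BH correction lowered it.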
