
How can I perform multiple pairwise t.test in R using the same reference vector?

Consider the following vectors, combined into a data frame:

ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)

x <- data.frame(data=c(ctrl,x1,x2,x3),
            Group=c(
              rep("ctrl", length(ctrl)),
              rep("x1", length(x1)),
              rep("x2", length(x2)),
              rep("x3", length(x3))) )

I know I could use

pairwise.t.test(x$data,
            x$Group,
            pool.sd=FALSE)

to get pairwise comparisons such as

 Pairwise comparisons using t tests with non-pooled SD 

data:  x$data and x$Group 

   ctrl    x1      x2     
x1 0.08522 -       -      
x2 0.99678 0.10469 -      
x3 0.00065 0.99678 2.8e-05

P value adjustment method: holm 

However, I am not interested in every possible combination of vectors. I am looking for a way to compare the ctrl vector with each of the other vectors while accounting for alpha inflation. I'd like to avoid

t.test(x$data[x$Group == 'ctrl'], x$data[x$Group == 'x1'], var.equal = TRUE)
t.test(x$data[x$Group == 'ctrl'], x$data[x$Group == 'x2'], var.equal = TRUE)
t.test(x$data[x$Group == 'ctrl'], x$data[x$Group == 'x3'], var.equal = TRUE)

and then performing a manual correction for multiple comparisons. What would be the best way to do so?

You can use p.adjust to correct several p-values for multiple comparisons; its default method is Holm, and method = "bonferroni" gives a Bonferroni adjustment. You should not bundle those unequal-length vectors into a data frame, but rather use a list.

ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)

> lapply( list(x1,x2,x3), function(x) t.test(x,ctrl)$p.value)
[[1]]
[1] 0.2464039

[[2]]
[1] 0.8576423

[[3]]
[1] 0.0144275

> p.adjust( .Last.value)
[1] 0.4928077 0.8576423 0.0432825
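
The same idea can be written as a script rather than a console session. This is only a minimal sketch: the seed, the list name groups, and the result names p_raw, p_holm and p_bonf are assumptions made for illustration and do not come from the answer above. Naming the list elements keeps each p-value labelled with its group, and spelling out the method makes it explicit which correction is applied.

```r
set.seed(1)  # arbitrary seed (an assumption), only so the sketch is reproducible

ctrl <- rnorm(50)

# A named list holds the unequal-length vectors and labels the results
groups <- list(
  x1 = rnorm(30,  mean = 0.2),
  x2 = rnorm(100, mean = 0.1),
  x3 = rnorm(100, mean = 0.4)
)

# One Welch t-test per group against ctrl; vapply returns a named numeric vector
p_raw <- vapply(groups, function(g) t.test(g, ctrl)$p.value, numeric(1))

# Adjust for the three comparisons; "holm" is what p.adjust uses by default
p_holm <- p.adjust(p_raw, method = "holm")
p_bonf <- p.adjust(p_raw, method = "bonferroni")

print(cbind(raw = p_raw, holm = p_holm, bonferroni = p_bonf))
```

The Holm-adjusted values are never larger than the Bonferroni ones, so Holm is usually the less conservative of the two choices.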

@BondedDust's answer looks great. Here is a slightly more complicated solution if you really need to work with data frames.

library(dplyr)

ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)

x <- data.frame(data=c(ctrl,x1,x2,x3),
                Group=c(
                  rep("ctrl", length(ctrl)),
                  rep("x1", length(x1)),
                  rep("x2", length(x2)),
                  rep("x3", length(x3))), stringsAsFactors = FALSE)

# provide the combinations you want
# set1 with all from set2
set1 = c("ctrl")
set2 = c("x1","x2","x3")

dt_res =
      data.frame(expand.grid(set1,set2)) %>%  # create combinations 
        mutate(test_id = row_number()) %>%    # create a test id
        group_by(test_id) %>%              # group by test id, so everything from now on is performed for each test separately
        do({x_temp = x[(x$Group==.$Var1 | x$Group==.$Var2),]    # for each test id keep groups of interest
            x_temp = data.frame(x_temp)}) %>%
        do(test = t.test(data~Group, data=.))     # perform the test and save it

# the result is a data frame with the test id and a list column holding the t.test results
dt_res

# Source: local data frame [3 x 2]
# Groups: <by row>
#   
#     test_id       test
#   1       1 <S3:htest>
#   2       2 <S3:htest>
#   3       3 <S3:htest>


# get all tests as a list
dt_res$test

# [[1]]
# 
# Welch Two Sample t-test
# 
# data:  data by Group
# t = -1.9776, df = 58.36, p-value = 0.05271
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#   -0.894829477  0.005371207
# sample estimates:
#   mean in group ctrl   mean in group x1 
# -0.447213560       -0.002484425 
# 
# 
# [[2]]
# 
# Welch Two Sample t-test
# 
# data:  data by Group
# t = -2.3549, df = 100.68, p-value = 0.02047
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#   -0.71174095 -0.06087081
# sample estimates:
#   mean in group ctrl   mean in group x2 
# -0.44721356        -0.06090768 
# 
# 
# [[3]]
# 
# Welch Two Sample t-test
# 
# data:  data by Group
# t = -5.4235, df = 101.12, p-value = 4.001e-07
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#   -1.2171386 -0.5652189
# sample estimates:
#   mean in group ctrl   mean in group x3 
# -0.4472136          0.4439652
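
The tests above still need their p-values pulled out and corrected. The following is a rough sketch of that last step using base R only, so it runs without dplyr; the seed and the names others, tests, p_raw, p_adj and p_check are assumptions introduced here. With a result like dt_res, the same extraction would presumably be applied to dt_res$test instead of tests. As a cross-check, the sketch also reads the unadjusted ctrl column from pairwise.t.test, which should give the same three Welch p-values.

```r
set.seed(1)  # arbitrary seed (an assumption) for reproducibility

ctrl <- rnorm(50)
x1 <- rnorm(30,  mean = 0.2)
x2 <- rnorm(100, mean = 0.1)
x3 <- rnorm(100, mean = 0.4)

x <- data.frame(data  = c(ctrl, x1, x2, x3),
                Group = rep(c("ctrl", "x1", "x2", "x3"),
                            times = c(length(ctrl), length(x1),
                                      length(x2), length(x3))),
                stringsAsFactors = FALSE)

# Every group except the reference group
others <- setdiff(unique(x$Group), "ctrl")

# One Welch t-test per group, each on the subset holding ctrl and that group
tests <- lapply(others, function(g) {
  t.test(data ~ Group, data = x[x$Group %in% c("ctrl", g), ])
})
names(tests) <- others

# Extract the raw p-values from the htest objects and adjust them together
p_raw <- vapply(tests, function(tt) tt$p.value, numeric(1))
p_adj <- p.adjust(p_raw, method = "holm")

# Cross-check: unadjusted pairwise tests, keeping only comparisons against ctrl
pw <- pairwise.t.test(x$data, x$Group, pool.sd = FALSE,
                      p.adjust.method = "none")
p_check <- pw$p.value[, "ctrl"]

print(cbind(raw = p_raw, adjusted = p_adj, check = p_check))
```

Taking the ctrl column before adjusting matters: adjusting inside pairwise.t.test would correct for all six pairs, not just the three of interest.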

PS: It's always interesting to work with p-values and alpha corrections. How to approach them is a somewhat philosophical issue on which people disagree. Personally, I tend to correct alpha based on all the possible comparisons I could make after an experiment, because you never know when you'll come back to investigate other pairs. Imagine that, in the future, people decide you have to go back and compare the winning group (let's say x1) with x2 and x3. You'll focus on those pairs and again correct alpha based on those comparisons only. But overall you will have performed every possible comparison apart from x2 vs x3! You may end up writing reports or publishing findings that should have used a somewhat stricter alpha correction.
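
To make the point above concrete, here is a small sketch of how much stricter the correction becomes when the three raw p-values are treated as part of all six possible pairwise comparisons. It reuses the p-values printed in the first answer; the names p_raw, p_three and p_six are assumptions, and so is the choice of Holm. It relies on the n argument of p.adjust, which the help page says to set only with care, so treat this as an illustration rather than a recommended workflow.

```r
# Raw p-values for x1, x2, x3 vs ctrl, copied from the first answer's output
p_raw <- c(x1 = 0.2464039, x2 = 0.8576423, x3 = 0.0144275)

# Correcting only for the three comparisons that were actually run
p_three <- p.adjust(p_raw, method = "holm")

# Correcting as if all choose(4, 2) = 6 pairwise comparisons were in play
p_six <- p.adjust(p_raw, method = "holm", n = choose(4, 2))

print(rbind(three = p_three, six = p_six))
```

With these numbers the x3 comparison stays below 0.05 under the three-test correction but not under the six-test one, which is exactly the situation the PS warns about.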
