
Simulating a t-test in R

I am looking for a way to simulate the power of a simple t-test at different sample sizes. My idea is to generate 400 random samples from a normal distribution, each with mean 5 and variance 1, and perform a t-test on each one for the hypothesis that the true mean is 4, i.e. the t-test would take the form:

t = (mean(x) - 4) * sqrt(n) / sd(x)  # for each sample x, which consists of n observations
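As a quick sanity check on the formula above, here is a minimal sketch comparing the hand-computed statistic with what `t.test()` reports for a single sample. The seed and the names `t_manual`, `t_builtin` and `p_manual` are my own choices for illustration, not part of the question.

```r
set.seed(42)                      # arbitrary seed, only for reproducibility
n <- 10
x <- rnorm(n, mean = 5, sd = 1)   # one sample with true mean 5

# t statistic computed by hand for H0: mu = 4
t_manual <- (mean(x) - 4) * sqrt(n) / sd(x)

# the same statistic as reported by t.test()
t_builtin <- unname(t.test(x, mu = 4)$statistic)

# two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

all.equal(t_manual, t_builtin)
all.equal(p_manual, t.test(x, mu = 4)$p.value)
```

Both comparisons should agree up to floating-point tolerance, so in the simulation it is enough to call `t.test()` and read off the p-value.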

For comparison, I would like the first 100 samples to consist of 10 observations, the next 100 of 100, the next 100 of 1000, and the last 100 of 5000, which I think is the upper limit. A t-test will have to be performed on each and every sample.

Lastly, I would like to see in what percentage of each sample group (call them n10, n100, n1000 and n5000, according to the number of observations per sample) my (false) hypothesis is rejected.

Could you please help me write the corresponding R code? I know the individual commands but have trouble putting them all together. This is a nice exercise, and hopefully I will then be able to modify it a bit and use it for other purposes as well.

Thank you in advance.

Here's a one-liner for 400 t-tests with n = 10:

R>simulations <- replicate(400, t.test(rnorm(10, mean=5, sd=1), mu=4),
                           simplify=FALSE);

Then you can analyze it:

R>table(sapply(simulations, "[[", "p.value") < .05)
FALSE  TRUE 
   75   325 
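The same idea can be stretched to the four groups from the question. This is only a sketch under my own assumptions: the names `sizes`, `reps`, `pvals` and `rejection_rate` and the seed are made up for illustration, and I keep just the p-value of each test instead of the whole `htest` object.

```r
set.seed(123)                          # arbitrary seed
sizes <- c(n10 = 10, n100 = 100, n1000 = 1000, n5000 = 5000)
reps  <- 100                           # samples per group, as in the question

# For each sample size, run `reps` t-tests and keep the p-values.
# The result is a reps x 4 matrix with one column per group.
pvals <- sapply(sizes, function(n) {
  replicate(reps, t.test(rnorm(n, mean = 5, sd = 1), mu = 4)$p.value)
})

# Share of tests per group that reject H0: mu = 4 at the 5% level
rejection_rate <- colMeans(pvals < 0.05)
print(rejection_rate)
```

Since the true mean is a full standard deviation away from the hypothesised one, the larger groups should reject essentially every time; only n10 leaves room for misses.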

I'm still learning R, too, so handle with care:

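n <- 5                                # number of sample-size groups
N <- 100                              # samples per group
samplesizes <- as.integer(10^(1:n))   # 10, 100, 1000, 10000, 100000
set.seed(1)

# generate samples: an n x N list-matrix, one column per replication
# (true mean 4 tested against mu = 5 mirrors the question's 5 vs 4; the shift is 1 either way)
samples <- replicate(N, mapply(rnorm, samplesizes, mean=4, sd=sqrt(1)))

# perform t-tests (lapply walks the list-matrix column by column)
t.tests <- lapply(samples, function(x) t.test(x, mu=5, alternative="two.sided"))

# get p-values
t.test.pvalues <- sapply(t.tests, function(x) x$p.value)

# H0 is rejected when the p-value falls below the 5% level
rejected <- t.test.pvalues < .05
sampleIndices <- rep(1:n, N)          # group index of each test, in the same column-wise order
res <- aggregate(rejected, list(sample=sampleIndices), FUN=mean)
names(res)[2] <- "percRejected"
print(res, row.names=FALSE)
# sample percRejected
# 1         0.84
# 2         1.00
# 3         1.00
# 4         1.00
# 5         1.00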
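If you want something to check the simulated rates against, `power.t.test()` from the stats package gives the analytic power for the same setup. A small sketch, assuming a shift of 1 (|5 - 4|), sd = 1 and a two-sided 5% test; I only use the three smallest group sizes from the question, since larger n just pushes the power closer to 1. The names `sizes` and `analytic_power` are mine.

```r
sizes <- c(10, 100, 1000)

# Analytic power of a one-sample, two-sided t-test at the 5% level
# for a true shift of 1 and sd = 1
analytic_power <- sapply(sizes, function(n) {
  power.t.test(n = n, delta = 1, sd = 1, sig.level = 0.05,
               type = "one.sample", alternative = "two.sided")$power
})

data.frame(n = sizes, power = round(analytic_power, 3))
```

The value for n = 10 should land in the same neighbourhood as the simulated rejection rates, e.g. the 325 out of 400 in the first answer; a simulation with only 100 samples per group will wobble by a few percentage points around it.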
