Confidence Interval of Difference of Means between two datasets

I'm working on two datasets derived from cats, a built-in R dataset.

> cats
    Sex Bwt  Hwt
1     F 2.0  7.0
2     F 2.0  7.4
3     F 2.0  9.5
4     F 2.1  7.2
5     F 2.1  7.3
6     F 2.1  7.6
7     F 2.1  8.1
8     F 2.1  8.2
9     F 2.1  8.3
10    F 2.1  8.5
11    F 2.1  8.7
12    F 2.1  9.8
...
137   M 3.6 13.3
138   M 3.6 14.8
139   M 3.6 15.0
140   M 3.7 11.0
141   M 3.8 14.8
142   M 3.8 16.8
143   M 3.9 14.4
144   M 3.9 20.5

I want to find the 99% confidence interval for the difference between the mean Bwt of male and female specimens (Sex == M and Sex == F respectively).

I know that t.test does this, among other things, but if I split cats into two datasets containing the Bwt of males and females, t.test() complains that the two datasets are not the same length, which is true: there are only 47 females in cats, and 97 males.

Is it doable some other way, or am I misinterpreting the data by breaking them up?

EDIT: An answerer on another question suggested a function that gets the CI of the mean of a single dataset, which may come in handy:

ci_func <- function(data, ALPHA){
  # Normal-approximation (z) interval for the mean of `data` at level 1 - ALPHA
  half_width <- qnorm(1-ALPHA/2) * sd(data)/sqrt(length(data))
  c(mean(data) - half_width, mean(data) + half_width)
}

You should apply t.test using the formula interface:

t.test(Bwt ~ Sex, data=cats, conf.level=.99)
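
To pull out just the interval, something like the sketch below should work. It assumes cats is loaded from the MASS package; the variable names (female_bwt, male_bwt, ci_manual and so on) are only illustrative. It also shows the two-vector call: as far as I know, t.test only requires equal lengths when paired = TRUE, so the default unpaired (Welch) test accepts groups of different sizes. The last part recomputes the same interval by hand as a sanity check.

```r
library(MASS)  # assumption: the cats dataset comes from MASS

# Formula interface: Bwt is split by the two levels of Sex (F, then M)
res <- t.test(Bwt ~ Sex, data = cats, conf.level = 0.99)
ci_formula <- res$conf.int  # 99% CI for mean(F) - mean(M)

# Equivalent two-vector call; the groups may have different lengths
# because the default test is unpaired (Welch)
female_bwt <- cats$Bwt[cats$Sex == "F"]
male_bwt   <- cats$Bwt[cats$Sex == "M"]
ci_vectors <- t.test(female_bwt, male_bwt, conf.level = 0.99)$conf.int

# Hand computation of the Welch interval as a cross-check
v_f <- var(female_bwt) / length(female_bwt)  # squared standard error, females
v_m <- var(male_bwt) / length(male_bwt)      # squared standard error, males
se  <- sqrt(v_f + v_m)
# Welch-Satterthwaite degrees of freedom
df  <- (v_f + v_m)^2 /
  (v_f^2 / (length(female_bwt) - 1) + v_m^2 / (length(male_bwt) - 1))
diff_means <- mean(female_bwt) - mean(male_bwt)
ci_manual  <- diff_means + c(-1, 1) * qt(1 - 0.01 / 2, df) * se

print(ci_formula)
print(ci_manual)
```

Note that the interval is for female minus male, because F is the first factor level; swap the arguments in the two-vector call if you want male minus female. The ci_func from the question uses qnorm and a single sample, so it is not a drop-in replacement for this two-sample interval.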
