R: How to conduct a two-sample t-test with two different survey designs
I want to perform a two-sample (Welch's) t-test on the equality of two means, one of which is obtained using simple random sampling (srsmean) and the other calculated using survey weighting with the survey package (mean_weighted). I also want to run a t-test between mean_weighted and the mean obtained when weighting and stratification are both implemented in the survey design (mean_strat).
I know there is a svyttest() function; however, as far as I can tell, it only tests the means of two samples within one survey design, not means obtained with different survey designs.
I also tried using rnorm to create fictional samples, e.g. c(rnorm(9710, mean = 156958.8, sd = 364368)). The problem with this approach is that in complex sampling methods like stratification, the effective n is usually smaller than the nominal n, so I am unsure what to use as n. Additionally, this method feels a bit contrived, as I would be fitting the data to a particular type of distribution.
Finally, I tried writing out the equation for a t-statistic myself. However, in calculating the "standard error of the difference of means" for a complex survey design, I again run into problems related to the "effective sample size."
Is there another approach that would work for both the t-test between srsmean and mean_weighted AND the t-test between mean_weighted and mean_strat?
library(survey)

# Toy data: outcome (wel), sampling weight (rmul), stratum id (strat)
wel <- c(68008.19, 128504.61, 21347.69,
         33272.95, 61828.96, 32764.44,
         92545.62, 58431.89, 95596.82,
         117734.27)
rmul <- c(16, 16, 16, 16, 16, 16, 16,
          20, 20, 20)
strat <- c(101, 101, 101, 101, 101, 102, 102, 102, 102, 102)
survey.data <- data.frame(wel, rmul, strat)

# Unweighted mean, treating the data as a simple random sample
srsmean <- mean(survey.data$wel)

# Weighted design without stratification
survey_weighted <- svydesign(data = survey.data,
                             ids = ~wel,
                             weights = ~rmul,
                             nest = TRUE)
mean_weighted <- svymean(~wel, survey_weighted)

# Weighted and stratified design
survey_strat <- svydesign(data = survey.data,
                          ids = ~wel,
                          weights = ~rmul,
                          strata = ~strat,
                          nest = TRUE)
mean_strat <- svymean(~wel, survey_strat)

I'm confused about the purpose of a t-test between your mean_weighted and mean_strat. Both designs use the same weights, so the difference between those coefficients will always be zero; stratification changes only the standard error. I might compare the simple random sample against the complex design like this:
library(survey)

wel <- c(68008.19, 128504.61, 21347.69,
         33272.95, 61828.96, 32764.44,
         92545.62, 58431.89, 95596.82,
         117734.27)
rmul <- c(16, 16, 16, 16, 16, 16, 16,
          20, 20, 20)
strat <- c(101, 101, 101, 101, 101, 102, 102, 102, 102, 102)
survey.data <- data.frame(wel, rmul, strat)

# Unweighted design: every observation gets equal weight
survey_unweighted <- svydesign(data = survey.data,
                               ids = ~1)
mean_unweighted <- svymean(~wel, survey_unweighted)

# Weighted and stratified design
survey_strat <- svydesign(data = survey.data,
                          ids = ~wel,
                          weights = ~rmul,
                          strata = ~strat,
                          nest = TRUE)
mean_strat <- svymean(~wel, survey_strat)

# Point estimates and design-based standard errors
coef_one <- coef(mean_unweighted)
coef_two <- coef(mean_strat)
se_one <- SE(mean_unweighted)
se_two <- SE(mean_strat)

# Test statistic: absolute difference over the combined standard error
# (adding the squared SEs treats the two estimates as independent)
t_statistic <- abs(coef_one - coef_two) / sqrt(se_one^2 + se_two^2)

# Two-sided p-value from the normal approximation
p_value <- (1 - pnorm(t_statistic)) * 2

# Flag significance at the 5% level (two-sided)
sig_diff <- ifelse(p_value < 0.05, "*", "")
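
The remark above that the difference between mean_weighted and mean_strat is always zero can be checked directly. The sketch below rebuilds the two designs from the question and runs the same normal-approximation comparison on them. Two things here are illustrative assumptions rather than part of the original code: compare_means is a hypothetical helper name, and ids = ~1 is used in place of ids = ~wel (all wel values are distinct, so each row is its own sampling unit either way).

```r
library(survey)

# Same toy data as in the question
wel   <- c(68008.19, 128504.61, 21347.69, 33272.95, 61828.96,
           32764.44, 92545.62, 58431.89, 95596.82, 117734.27)
rmul  <- c(16, 16, 16, 16, 16, 16, 16, 20, 20, 20)
strat <- c(101, 101, 101, 101, 101, 102, 102, 102, 102, 102)
survey.data <- data.frame(wel, rmul, strat)

# Weighted design without strata, and the same design with strata
# (ids = ~1 is an assumption of this sketch, see the note above)
survey_weighted <- svydesign(ids = ~1, weights = ~rmul, data = survey.data)
survey_strat    <- svydesign(ids = ~1, weights = ~rmul, strata = ~strat,
                             data = survey.data)

mean_weighted <- svymean(~wel, survey_weighted)
mean_strat    <- svymean(~wel, survey_strat)

# Hypothetical helper: normal-approximation comparison of two svymean results,
# treating the two estimates as independent
compare_means <- function(est_one, est_two) {
  diff <- as.numeric(coef(est_one)) - as.numeric(coef(est_two))
  se   <- sqrt(as.numeric(SE(est_one))^2 + as.numeric(SE(est_two))^2)
  z    <- diff / se
  c(difference = diff, z = z, p_value = 2 * pnorm(-abs(z)))
}

# The point estimates use identical weights, so the difference should be zero
result <- compare_means(mean_weighted, mean_strat)
print(result)

# Only the standard errors depend on whether strata are declared
print(c(weighted   = as.numeric(SE(mean_weighted)),
        stratified = as.numeric(SE(mean_strat))))
```

Because the numerator is zero by construction, this comparison says nothing about whether stratification "matters"; comparing the two standard errors directly is likely the more informative check.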