
R: How to conduct a two-sample t-test with two different survey designs

I want to perform a two-sample (Welch's) t-test on the equality of two means, one of which is obtained using simple random sampling (srsmean), and the other of which is calculated using survey weighting with the survey package (mean_weighted). I also want to conduct a t-test between mean_weighted and the mean obtained when weighting and stratification are both implemented in the survey design (mean_strat).

I know there is a svyttest() function; however, as far as I can tell, it only compares the means of two groups within one survey design, not means obtained with different survey designs.

I also tried using rnorm to create fictional samples, e.g. c(rnorm(9710, mean = 156958.8, sd = 364368)), but the problem with this approach is that in complex sampling methods like stratification, the effective n is usually smaller than the nominal n, so I am unsure what to use as n. Additionally, this method feels a bit contrived, as I would be forcing the data into a particular type of distribution.
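For a rough sense of how far the effective n can sit from the nominal n, one commonly used shortcut is Kish's approximation, which looks only at the weights and ignores strata and clustering. The sketch below is an illustration under that assumption; kish_n_eff is a hypothetical helper name, not a function from the survey package, and the weights are the rmul values from the example data.

```r
# Hypothetical helper: Kish's approximate effective sample size.
# It accounts for unequal weights only, not for strata or clusters.
kish_n_eff <- function(w) sum(w)^2 / sum(w^2)

# Weights from the example data (rmul)
rmul <- c(16, 16, 16, 16, 16, 16, 16,
          20, 20, 20)

n_nominal <- length(rmul)
n_eff <- kish_n_eff(rmul)

print(c(nominal = n_nominal, effective = n_eff))
```

With weights this close to each other, the approximate effective n stays just below the nominal n; more variable weights would pull it down further.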

Finally, I tried writing out the equation for a t-statistic myself; however, in calculating the "standard error of the difference of means" involving a complex survey design, I also run into problems related to the "effective sample size."
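For reference, one way to write the statistic so that no sample size appears explicitly is in terms of each estimate's own standard error. This is a sketch under two assumptions that are not stated in the post: the two estimates are treated as independent, and each standard error already reflects its own design. The symbols are introduced here for illustration only.

```latex
t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\mathrm{SE}(\bar{y}_1)^2 + \mathrm{SE}(\bar{y}_2)^2}}
```

Here \bar{y}_1 and \bar{y}_2 stand for the two estimated means, for example srsmean and mean_weighted.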

Is there another approach that would work for both the t-test between srsmean and mean_weighted AND the t-test between mean_weighted and mean_strat?

library(survey)

wel <- c(68008.19, 128504.61,  21347.69,
         33272.95,  61828.96,  32764.44,
         92545.62,  58431.89,  95596.82,
         117734.27)
rmul <- c(16, 16, 16, 16, 16, 16, 16,
          20, 20, 20)
strat <- c(101, 101, 101, 101, 101, 102, 102, 102, 102, 102)


survey.data <- data.frame(wel, rmul, strat)

srsmean <- mean(survey.data$wel)

# Weighted design without strata. ids = ~wel makes every row its own PSU,
# because all wel values are distinct.
survey_weighted <- svydesign(data = survey.data,
                             ids = ~wel,
                             weights = ~rmul,
                             nest = TRUE)

mean_weighted <- svymean(~wel, survey_weighted)

# Weighted and stratified design
survey_strat <- svydesign(data = survey.data,
                          ids = ~wel,
                          weights = ~rmul,
                          strata = ~strat,
                          nest = TRUE)
mean_strat <- svymean(~wel, survey_strat)

I'm confused about the purpose of a t-test between your mean_weighted and mean_strat, since the difference between those coefficients will always be zero: with the same weights, stratification changes the standard error but not the point estimate. I might compare the simple random sample against the complex design like this:

library(survey)

wel <- c(68008.19, 128504.61,  21347.69,
         33272.95,  61828.96,  32764.44,
         92545.62,  58431.89,  95596.82,
         117734.27)
rmul <- c(16, 16, 16, 16, 16, 16, 16,
          20, 20, 20)
strat <- c(101, 101, 101, 101, 101, 102, 102, 102, 102, 102)

survey.data <- data.frame(wel, rmul, strat)

# Unweighted design: equal-probability sampling, each row its own PSU
survey_unweighted <- svydesign(data = survey.data,
                               ids = ~1)

mean_unweighted <- svymean(~wel, survey_unweighted)

# Weighted and stratified design
survey_strat <- svydesign(data = survey.data,
                          ids = ~wel,
                          weights = ~rmul,
                          strata = ~strat,
                          nest = TRUE)
mean_strat <- svymean(~wel, survey_strat)


# Point estimates and design-based standard errors
coef_one <- coef(mean_unweighted)
coef_two <- coef(mean_strat)
se_one <- SE(mean_unweighted)
se_two <- SE(mean_strat)

# Test statistic for the difference, treating the two estimates as independent
t_statistic <- abs(coef_one - coef_two) / sqrt(se_one^2 + se_two^2)
# Two-sided p-value from the normal approximation
p_value <- (1 - pnorm(t_statistic)) * 2
# Flag significance at the 5% level
sig_diff <- ifelse(p_value < 0.05, "*", "")
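The same calculation can be wrapped in a small function so it can be reused for any pair of estimates. This is only a sketch: compare_estimates is a hypothetical helper, not part of the survey package, it assumes the two estimates are independent, and the optional df argument is an assumption added here for cases where a t reference distribution is preferred over the normal one. The inputs in the usage lines are made-up numbers, not results from the data above.

```r
# Hypothetical helper: compare two estimates given only their point
# estimates and standard errors.
compare_estimates <- function(est_one, se_one, est_two, se_two, df = Inf) {
  # Standard error of the difference, assuming independent estimates
  se_diff <- sqrt(se_one^2 + se_two^2)
  statistic <- (est_one - est_two) / se_diff
  # df = Inf gives the normal approximation; a finite df uses the t distribution
  p_value <- 2 * pt(-abs(statistic), df = df)
  list(statistic = statistic, se_diff = se_diff, p_value = p_value)
}

# Made-up inputs, purely for illustration
res_normal <- compare_estimates(10, 3, 4, 4)
res_t <- compare_estimates(10, 3, 4, 4, df = 8)

print(res_normal$p_value)
print(res_t$p_value)
```

With real designs, the inputs would come from coef() and SE() as above. If a finite df is wanted, the design degrees of freedom reported by degf() in the survey package is one candidate, though which design's value to use when the two designs differ is a judgment call.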


 