How to replicate SUDAAN 75th percentile and 95% confidence intervals by age groups in R's 'survey' package?

I'm trying to replicate quantile estimates with 95% confidence intervals by age group, originally produced with SAS and SUDAAN, using the 'survey' package in R with NHANES data. The package's 'svyby' function combined with its 'svyquantile' function lets you perform this analysis quite easily; my results are close to, but not exactly the same as, the results generated by SUDAAN.

I believe this may be due to the number of arguments that the 'svyby' and 'svyquantile' functions allow you to customize. The arguments that 'svyquantile' takes include 'method', 'interval.type', 'ties', 'return.replicates', etc.

I've found an article which explains how to replicate some SUDAAN functions with the 'survey' package, but it does not explain how to replicate quantile estimates. From some research on how SUDAAN estimates quantiles, I believe the 'method' argument should be set to 'linear'. Beyond that, I've tried setting the various arguments to different values, but have not had any luck replicating the SUDAAN estimates exactly.

Does anyone know how to replicate SUDAAN quantile estimates and 95% confidence intervals by group, or have any documentation on the methodology SUDAAN uses, so that I can better replicate this analysis with the 'survey' package in R?

My approach is shown in the code below. The results of the 'svyby' function seem like reasonable estimates; however, they are not identical to the results produced by SUDAAN and SAS. I don't have access to SUDAAN or SAS, but my objective is to replicate their results in R. Specifically, the 75th percentile of PCB 118 for the 60+ age group according to SUDAAN and SAS is 25.89 ng/g lipid (95% CI: 22.97-30.17). Thank you.

library(RNHANES)
library(survey)

# import NHANES 2003-2004 PCB Dataset 
pcbs <- nhanes_load_data("L28DFP_C", "2003-2004", demographics = TRUE)

# create appropriate age groups
pcbs$age <- ifelse(pcbs$RIDAGEYR < 20, "<20",
            ifelse(pcbs$RIDAGEYR >= 20 & pcbs$RIDAGEYR <= 39, "20-39",
            ifelse(pcbs$RIDAGEYR >= 40 & pcbs$RIDAGEYR <= 59, "40-59",
            ifelse(pcbs$RIDAGEYR >= 60, "60+", ""))))
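# set the level order explicitly; relabelling with levels<- after as.factor()
# depends on the locale's sort order and can silently mislabel the groups
pcbs$age <- factor(pcbs$age, levels = c("<20", "20-39", "40-59", "60+"))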

# assign survey design
nhanes.dsgn <- svydesign(id = ~SDMVPSU, strata = ~SDMVSTRA , weights = ~ WTSC2YR, data = pcbs, nest = TRUE)

# quantiles for subpopulations
svyby(~LBX118LA, ~age, nhanes.dsgn, svyquantile, quantiles = 0.75, ci = TRUE, alpha = 0.05, vartype = "ci", na.rm = TRUE, method = "linear")

From the documentation for the 'survey' package: "Combining interval.type="betaWald" and ties="discrete" is (close to) the proposal of Shah and Vaish (2006) used in some versions of SUDAAN."

So:

PCB118LA <- svyby(~LBX118LA, ~age, nhanes.dsgn, svyquantile, quantiles = 0.75, ci = TRUE, alpha = 0.05, vartype = "ci", na.rm = TRUE, method = "linear", ties = "discrete", interval.type = "betaWald")
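
A note for newer installations: as far as I know, 'svyquantile' was rewritten in version 4.1 of 'survey', and the rewritten function no longer takes 'method', 'ties' or interval.type = "betaWald" (the earlier implementation should still be available as 'oldsvyquantile'). The sketch below shows what I would try with the rewritten function. It assumes survey >= 4.1 and uses the package's built-in 'api' example data as a stand-in for the NHANES design, so the numbers it prints have nothing to do with PCB 118. The choice of interval.type = "beta" is my assumption about the nearest counterpart to "betaWald", not a confirmed match to SUDAAN.

```r
library(survey)

# Built-in example data shipped with 'survey' (a stand-in for the NHANES design)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# 75th percentile of api00 by school type, with 95% confidence intervals.
# qrule = "shahvaish" selects the Shah and Vaish (2006) quantile rule.
# interval.type = "beta" is an assumption, not a verified SUDAAN equivalent.
q75 <- svyby(~api00, ~stype, dclus1, svyquantile,
             quantiles = 0.75, ci = TRUE, alpha = 0.05, vartype = "ci",
             na.rm = TRUE, qrule = "shahvaish", interval.type = "beta")
print(q75)
```

For the NHANES analysis, the same two arguments would replace method, ties and interval.type = "betaWald" in the svyby call on nhanes.dsgn. Whether that reproduces 25.89 (22.97-30.17) exactly is something I have not checked.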


