How to use gtsummary::tbl_svysummary() to display confidence intervals for levels of a factor variable?
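I am using survey data from the National Electronic Injury Surveillance System ( https://www.cpsc.gov/Research--Statistics/NEISS-Injury-Data ) to study trends in consumer product injuries.
Using gtsummary and tbl_svysummary(), my goal is to create a descriptive table of injury summary measures. Because this is survey data, I would like to display the 95% confidence interval associated with each summary measure.
A previous post provides a solution for producing confidence intervals for a two-level factor variable ("Using the (gtsummary) tbl_svysummary function to display confidence intervals for a survey.design object?"). However, I am looking for a solution that produces confidence intervals for factor variables with >=2 levels.
I borrowed a reproducible example from the previous post: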
library(gtsummary)
library(survey)

svy_trial <-
  svydesign(~1, data = trial %>% select(trt, response, death), weights = ~1)

# returns the formatted 95% CI of `variable` within each level of `by`,
# one column per level, named add_stat_1, add_stat_2, ...
ci <- function(variable, by, data, ...) {
  svyby(as.formula(paste0("~", variable)), by = as.formula(paste0("~", by)), data, svyciprop, vartype = "ci") %>%
    tibble::as_tibble() %>%
    dplyr::mutate_at(vars(ci_l, ci_u), ~style_number(., scale = 100) %>% paste0("%")) %>%
    dplyr::mutate(ci = stringr::str_glue("{ci_l}, {ci_u}")) %>%
    dplyr::select(all_of(c(by, "ci"))) %>%
    tidyr::pivot_wider(names_from = all_of(by), values_from = ci) %>%
    purrr::set_names(paste0("add_stat_", seq_len(ncol(.))))
}

ci("response", "trt", svy_trial)
#> # A tibble: 1 x 2
#> add_stat_1 add_stat_2
#> <glue> <glue>
#> 1 21%, 40% 25%, 44%

svy_trial %>%
  tbl_svysummary(by = "trt", missing = "no") %>%
  add_stat(everything() ~ "ci") %>%
  modify_table_body(
    dplyr::relocate, add_stat_1, .after = stat_1
  ) %>%
  modify_header(starts_with("add_stat_") ~ "**95% CI**") %>%
  modify_footnote(everything() ~ NA)
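(Screenshot of the table from the previous post)
In the example above, the factor variable has two levels, and summary data is displayed for only one of them.
Here is an example of the table I would like to produce: (screenshot of the desired table output)
Apologies if this request for help does not follow good Stack Overflow etiquette (I am fairly new to this community). Thank you very much for your time and help!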
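I prepared an example for factors with >=2 levels, but without using the by= variable (although the approach would be similar). FYI, we have an open issue to support survey objects more thoroughly with a new function, add_ci.tbl_svysummary(), which will calculate CIs for both categorical and continuous variables. You can click the "subscribe" link here to be alerted when this feature is implemented: https://github.com/ddsjoberg/gtsummary/issues/965
In the meantime, here is a code example: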
library(gtsummary)
library(tidyverse)
packageVersion("gtsummary")
#> [1] '1.5.0'

svy <- survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq)

# put the CI in a tibble with the variable name
# first create a data frame with each variable and its values
df_result <-
  tibble(variable = c("Class", "Sex", "Age", "Survived")) %>%
  # get the levels of each variable in a new column
  # adding them as a list to allow for different variable classes
  rowwise() %>%
  mutate(
    # level to be used to construct call
    level = unique(svy$variables[[variable]]) %>% as.list() %>% list(),
    # character version to be merged into table
    label = unique(svy$variables[[variable]]) %>% as.character() %>% as.list() %>% list()
  ) %>%
  unnest(c(level, label)) %>%
  mutate(
    label = unlist(label)
  )

# construct call to svyciprop
df_result$svyciprop <-
  map2(
    df_result$variable, df_result$label,
    function(variable, level) rlang::inject(survey::svyciprop(~I(!!rlang::sym(variable) == !!level), svy))
  )

# round/format the 95% CI
df_result <-
  df_result %>%
  rowwise() %>%
  mutate(
    ci =
      svyciprop %>%
      attr("ci") %>%
      style_sigfig(scale = 100) %>%
      paste0("%", collapse = ", ")
  ) %>%
  ungroup() %>%
  # keep variables needed in tbl
  select(variable, label, ci)

# construct gtsummary table with CI
tbl <-
  svy %>%
  tbl_svysummary() %>%
  # merge in CI
  modify_table_body(
    ~.x %>%
      left_join(
        df_result,
        by = c("variable", "label")
      )
  ) %>%
  # add a header
  modify_header(ci = "**95% CI**")
Created on 2021-12-04 by the reprex package (v2.0.1)
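To make the map2() step less opaque, here is a minimal sketch of what it evaluates for a single variable/level pair, written out by hand. It assumes the same Titanic design as above; the choice of Class == "1st" and the name res are only for illustration and are not part of the answer's code.

```r
library(survey)

# same design as in the answer: Titanic cell counts used as weights
svy <- svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq)

# weighted proportion of observations with Class == "1st";
# I() turns the comparison into a TRUE/FALSE indicator inside the formula
res <- svyciprop(~I(Class == "1st"), svy)

as.numeric(res)  # point estimate of the proportion
attr(res, "ci")  # lower and upper 95% limits, on the 0-1 scale
```

The "ci" attribute is the length-two vector that the formatting step passes to style_sigfig(scale = 100) and then collapses into a single "lower%, upper%" string per level.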