Labelling of ordered factor variable
I am trying to produce a univariate output table with the gtsummary package.
structure(list(id = 1:10, age = structure(c(3L, 3L, 2L, 3L, 2L,
2L, 2L, 1L, 1L, 1L), .Label = c("c", "b", "a"), class = c("ordered",
"factor")), sex = structure(c(2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 2L), .Label = c("F", "M"), class = "factor"), country = structure(c(1L,
1L, 1L, 1L, 3L, 3L, 3L, 2L, 2L, 2L), .Label = c("eng", "scot",
"wale"), class = "factor"), edu = structure(c(1L, 1L, 1L, 2L,
2L, 2L, 3L, 3L, 3L, 3L), .Label = c("x", "y", "z"), class = "factor"),
lungfunction = c(45L, 23L, 25L, 45L, 70L, 69L, 90L, 50L,
62L, 45L), ivdays = c(15L, 26L, 36L, 34L, 2L, 4L, 5L, 8L,
9L, 15L), no2 = c(40L, 70L, 50L, 60L, 30L, 25L, 80L, 89L,
10L, 40L), pm25 = c(15L, 20L, 36L, 48L, 25L, 36L, 28L, 15L,
25L, 15L)), row.names = c(NA, 10L), class = "data.frame")
...
library(gtsummary)
publication_dummytable1_sum %>%
  select(sex, age, lungfunction, ivdays) %>%
  tbl_uvregression(
    method = lm,
    y = lungfunction,
    pvalue_fun = ~ style_pvalue(.x, digits = 3)
  ) %>%
  add_global_p() %>% # add global p-value
  bold_p() %>% # bold p-values under a given threshold
  bold_labels()
...
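When I run this code, I get the following output. The problem is the labels of the ordered factor variable (age): R chooses its own labels for ordered factor variables. Is it possible to tell R not to choose its own labels for ordered factor variables?
I want output like the following: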
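Like many others, I think you may be misunderstanding what an "ordered" factor means in R. In a sense, all factors in R are ordered: estimates and the like are typically printed, plotted, etc. in the order of the levels vector. Declaring a factor as ordered has two main effects:
- it allows inequalities to be evaluated on the levels of the factor (e.g. filter(age > "b"))
- the contrasts default to orthogonal polynomial contrasts, which is where the L (linear) and Q (quadratic) labels come from: see e.g. this CrossValidated answer for more details.
If you want this variable to be handled the same way as a regular factor (so that the estimates are differences between each group and the baseline level, i.e. treatment contrasts), you can:
- convert it back to an unordered factor (e.g. factor(age, ordered=FALSE))
- specify treatment contrasts in the model call (in base R, contrasts = list(age = "contr.treatment"))
- set options(contrasts = c(unordered = "contr.treatment", ordered = "contr.treatment")) (the default for ordered is "contr.poly")
If you have an unordered ("regular") factor and the levels are not in the order you want, you can reset the level order by specifying the levels explicitly, e.g.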
mutate(across(age, ~ factor(.x,
  levels = c("0-10 years", "11-20 years", "21-30 years", "30-40 years"))))
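By default R puts factor levels in alphabetical order, which is sometimes not what you want (but I can't think of a case where the order would be "random"...)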
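The two effects of the ordered class, and the three ways of getting treatment contrasts back, can be sketched in base R alone. This is only an illustration: the vectors below are retyped from the question's age and lungfunction columns, and the object names (dat, fit_poly, fit_unord, fit_arg, fit_opt) are made up for this example.

```r
# Stand-in for the question's data; levels are stored in the order c < b < a
age <- factor(c("a", "a", "b", "a", "b", "b", "b", "c", "c", "c"),
              levels = c("c", "b", "a"), ordered = TRUE)
lungfunction <- c(45, 23, 25, 45, 70, 69, 90, 50, 62, 45)
dat <- data.frame(age, lungfunction)

# Effect 1: inequalities are defined on ordered factors
sum(dat$age > "b")
#> [1] 3

# Effect 2: polynomial contrasts by default, hence the .L / .Q labels
fit_poly <- lm(lungfunction ~ age, data = dat)
names(coef(fit_poly))
#> [1] "(Intercept)" "age.L"       "age.Q"

# Option 1: drop the ordered class (level order is kept)
dat_unord <- transform(dat, age = factor(age, ordered = FALSE))
fit_unord <- lm(lungfunction ~ age, data = dat_unord)
names(coef(fit_unord))
#> [1] "(Intercept)" "ageb"        "agea"

# Option 2: keep the ordered class, ask for treatment contrasts in the call
fit_arg <- lm(lungfunction ~ age, data = dat,
              contrasts = list(age = "contr.treatment"))

# Option 3: change the session-wide default, then restore the old setting
old <- options(contrasts = c(unordered = "contr.treatment",
                             ordered = "contr.treatment"))
fit_opt <- lm(lungfunction ~ age, data = dat)
options(old)

# All three options give the same estimates relative to baseline level "c"
all.equal(coef(fit_unord), coef(fit_arg))
all.equal(coef(fit_unord), coef(fit_opt))
```

Option 3 affects every model fitted afterwards in the session, which is why the sketch restores the previous setting straight away; options 1 and 2 are local to one variable or one model call.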
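The easiest way to get rid of the odd labels for ordered variables is to remove the ordered class from those factor variables. Example below!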
library(gtsummary)
library(tidyverse)
packageVersion("gtsummary")
#> [1] '1.4.2'
publication_dummytable1_sum <-
structure(list(id = 1:10, age = structure(c(3L, 3L, 2L, 3L, 2L,
2L, 2L, 1L, 1L, 1L), .Label = c("c", "b", "a"), class = c("ordered",
"factor")), sex = structure(c(2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 2L), .Label = c("F", "M"), class = "factor"), country = structure(c(1L,
1L, 1L, 1L, 3L, 3L, 3L, 2L, 2L, 2L), .Label = c("eng", "scot",
"wale"), class = "factor"), edu = structure(c(1L, 1L, 1L, 2L,
2L, 2L, 3L, 3L, 3L, 3L), .Label = c("x", "y", "z"), class = "factor"),
lungfunction = c(45L, 23L, 25L, 45L, 70L, 69L, 90L, 50L,
62L, 45L), ivdays = c(15L, 26L, 36L, 34L, 2L, 4L, 5L, 8L,
9L, 15L), no2 = c(40L, 70L, 50L, 60L, 30L, 25L, 80L, 89L,
10L, 40L), pm25 = c(15L, 20L, 36L, 48L, 25L, 36L, 28L, 15L,
25L, 15L)), row.names = c(NA, 10L), class = "data.frame") |>
as_tibble()
# R labels ordered factors like this in lm()
lm(lungfunction ~ age, publication_dummytable1_sum)
#>
#> Call:
#> lm(formula = lungfunction ~ age, data = publication_dummytable1_sum)
#>
#> Coefficients:
#> (Intercept)        age.L        age.Q
#>       51.17       -10.37       -15.11
tbl <-
  publication_dummytable1_sum %>%
  # remove ordered class
  mutate(across(where(is.ordered), ~ factor(., ordered = FALSE))) %>%
  select(sex, age, lungfunction, ivdays) %>%
  tbl_uvregression(
    method = lm,
    y = lungfunction,
    pvalue_fun = ~ style_pvalue(.x, digits = 3)
  )
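If you would rather not depend on dplyr for the "remove ordered class" step, a rough base R equivalent might look like the sketch below. The small data frame dat and the helper vector is_ord are hypothetical stand-ins, not part of the original answer.

```r
# Hypothetical miniature of the question's data frame
dat <- data.frame(
  age = factor(c("a", "b", "c", "b"), levels = c("c", "b", "a"), ordered = TRUE),
  sex = factor(c("M", "F", "M", "F")),
  lungfunction = c(45, 70, 50, 69)
)

# Find the ordered columns and rebuild each one as a plain factor
is_ord <- vapply(dat, is.ordered, logical(1))
dat[is_ord] <- lapply(dat[is_ord], factor, ordered = FALSE)

# No ordered columns remain, and the level order is unchanged
any(vapply(dat, is.ordered, logical(1)))
#> [1] FALSE
levels(dat$age)
#> [1] "c" "b" "a"
```

Because the level order survives the conversion, the first level ("c" here) stays the baseline in the regression table.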