
Stratified log-rank test in R for counting process form data?

Background: at each half-yearly follow-up visit over 4 years, patients may switch to a different medication group. To account for this, I've converted the survival data into counting process form. I want to compare survival curves for medication groups A, B, and C. I am using an extended Cox model, but I want to do pairwise comparisons of the hazard functions or run stratified log-rank tests. I think pairwise_survdiff throws an error because of the form of my data.

Example data:

x<-data.frame(tstart=rep(seq(0,18,6),3),tstop=rep(seq(6,24,6),3), rx = rep(c("A","B","C"),4), death=c(rep(0,11),1))
x

Problem:

When using survdiff from the survival package,

survdiff(Surv(tstart,tstop,death) ~ rx, data = x)

I get the error:

Error in survdiff(Surv(tstart, tstop, death) ~ rx, data = x) : 
  Right censored data only

I think this stems from the counting process form, since I can't find an example online that compares survival curves with time-varying covariates.
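The difference between the two data forms can be seen in the type recorded on the Surv object itself. The following is a minimal sketch with made-up toy values (not the data above): a Surv object built from a single time plus an event indicator has type "right", while one built from start and stop times has type "counting", which is consistent with the "Right censored data only" message.

```r
library(survival)

# Right-censored form: one follow-up time and an event indicator per subject
# (toy values, assumed for illustration only)
s_right <- Surv(time = c(6, 12, 24), event = c(0, 1, 1))

# Counting process form: (start, stop] intervals with an event indicator
s_count <- Surv(time = c(0, 6, 12), time2 = c(6, 12, 24), event = c(0, 0, 1))

# The type attribute records which form the response is in
type_right <- attr(s_right, "type")
type_count <- attr(s_count, "type")
print(c(type_right, type_count))
```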

Question: is there a quick fix for this problem? Or is there an alternative package or function with the same versatility for comparing survival curves, namely one that offers different test methods? How can I implement stratified log-rank tests using survdiff on counting process form data?

NOTE: this was marked as a known issue in the survminer package (see the corresponding GitHub issue), but updating survminer did not solve my problem. Collapsing each row to a single time interval, tstop - tstart, wouldn't be correct either, since that would leave, e.g., multiple entries at 6 months rather than entries spanning the actual interval at risk.

So, here is an example of fitting the model and making the multiple comparisons using the multcomp package. Note that this implicitly assumes that the administration of treatments A to C is random. Depending on the assumptions about the process, it might be better to fit a multistate model with transitions between treatments and the outcome.

library(purrr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(survival)
library(multcomp)
#> Loading required package: mvtnorm
#> Loading required package: TH.data
#> Loading required package: MASS
#> 
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
#> 
#> Attaching package: 'TH.data'
#> The following object is masked from 'package:MASS':
#> 
#>     geyser
# simulate survival data
set.seed(123)
n <- 200
df <- data.frame(
  id = rep(1:n, each = 8),
  start = rep(seq(0, 42, by = 6), times = n),
  stop = rep(seq(6, 48, by = 6), times = n),
  rx = sample(LETTERS[1:3], n * 8, replace = TRUE))
df$hazard <- exp(-3.5  -1 * (df$rx == "A") + .5 * (df$rx == "B") +
  .5 * (df$rx == "C"))

df_surv <- data.frame(id = 1:n)
df_surv$time <- split(df, f = df$id) %>%
  map_dbl(~msm::rpexp(n = 1, rate = .x$hazard, t = .x$start))

df <- df %>% left_join(df_surv)
#> Joining, by = "id"
df <- df %>%
  mutate(status = 1L * (time <= stop)) %>%
  filter(start <= time)
df %>% head()
#>   id start stop rx     hazard     time status
#> 1  1     0    6  A 0.01110900 13.78217      0
#> 2  1     6   12  C 0.04978707 13.78217      0
#> 3  1    12   18  B 0.04978707 13.78217      1
#> 4  2     0    6  B 0.04978707 22.37251      0
#> 5  2     6   12  B 0.04978707 22.37251      0
#> 6  2    12   18  C 0.04978707 22.37251      0

# fit the model 
model <- coxph(Surv(start, stop, status)~rx, data = df)

# define pairwise comparison
glht_rx <- multcomp::glht(model, linfct=multcomp::mcp(rx="Tukey"))
glht_rx
#> 
#>   General Linear Hypotheses
#> 
#> Multiple Comparisons of Means: Tukey Contrasts
#> 
#> 
#> Linear Hypotheses:
#>            Estimate
#> B - A == 0  1.68722
#> C - A == 0  1.60902
#> C - B == 0 -0.07819

# perform multiple comparisons 
# (adjusts for multiple comparisons + takes into account correlation of coefficients -> more power than e.g. bonferroni)
smry_rx <- summary(glht_rx)
smry_rx # -> B and C different to A, but not from each other
#> 
#>   Simultaneous Tests for General Linear Hypotheses
#> 
#> Multiple Comparisons of Means: Tukey Contrasts
#> 
#> 
#> Fit: coxph(formula = Surv(start, stop, status) ~ rx, data = df)
#> 
#> Linear Hypotheses:
#>            Estimate Std. Error z value Pr(>|z|)    
#> B - A == 0  1.68722    0.28315   5.959   <1e-05 ***
#> C - A == 0  1.60902    0.28405   5.665   <1e-05 ***
#> C - B == 0 -0.07819    0.16509  -0.474     0.88    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> (Adjusted p values reported -- single-step method)
# confidence intervals
plot(smry_rx)

Created on 2019-04-01 by the reprex package (v0.2.1)
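As for the stratified log-rank test asked about in the question, one possible route is the score test of a Cox model. With the group as the only covariate it corresponds to the log-rank test (the two agree exactly when there are no tied event times), and coxph accepts both counting process data and strata(). The sketch below is an illustration under assumptions, not a drop-in replacement for survdiff: the data are simulated with base R only, the site variable is a hypothetical stratification factor, and the pairwise step simply drops the intervals spent on the third treatment and applies a Holm correction.

```r
library(survival)

set.seed(1)
n <- 150
starts <- seq(0, 42, by = 6)
# Hypothetical log hazard ratios, mirroring the simulation above
log_hr <- c(A = -1, B = 0.5, C = 0.5)

# Simulate counting process data: treatment is redrawn every 6 months
rows <- list()
for (i in seq_len(n)) {
  site <- sample(c("s1", "s2"), 1)  # hypothetical stratification factor
  for (s in starts) {
    rx <- sample(names(log_hr), 1)
    gap <- rexp(1, rate = exp(-3.5 + log_hr[[rx]]))
    died <- gap < 6
    rows[[length(rows) + 1]] <- data.frame(
      id = i, site = site, start = s,
      stop = if (died) s + gap else s + 6,
      rx = rx, status = as.integer(died))
    if (died) break  # no further intervals after the event
  }
}
cp <- do.call(rbind, rows)
cp$rx <- factor(cp$rx)

# Score test of a stratified Cox model on (start, stop] data
score_test <- function(data) {
  fit <- coxph(Surv(start, stop, status) ~ rx + strata(site), data = data)
  summary(fit)$sctest  # named vector: test, df, pvalue
}

# Overall comparison of A, B and C
overall <- score_test(cp)

# Pairwise comparisons: keep only intervals spent on the two treatments
pairs <- combn(levels(cp$rx), 2, simplify = FALSE)
p_raw <- sapply(pairs, function(g) {
  score_test(droplevels(cp[cp$rx %in% g, ]))[["pvalue"]]
})
names(p_raw) <- sapply(pairs, paste, collapse = " vs ")

# Holm adjustment for the three pairwise tests
p_adj <- p.adjust(p_raw, method = "holm")
print(overall)
print(p_adj)
```

Unlike the glht approach above, the Holm step ignores the correlation between the pairwise contrasts, so it can be expected to be somewhat more conservative.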
