tmerge function in R for time-dependent covariates

Question

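I have the tibbles df1 and df2, and I would like to create df_temp from them using dplyr operations. The application is a time-varying covariate in a survival model with delayed entry, where start_time is age. Does anyone have a solution using dplyr or tmerge?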

library(dplyr)
library(magrittr)
library(survival)


df1 =
  tibble(id = c(1,2,3),
         start_time = c(5,10,15),
         stop_time = c(8,17,25),
         event = c(1,1,0))


df2 = tibble(
         id = c(1,2,3),
         stop_time_cancer = c(6, NA, 20),
         cancer_status = c(1,0,1))


df_temp <- tibble(
  id = c(1,1,2,3,3),
  start_time = c(5,6,10,15,20),
  stop_time = c(6,8,17,20,25), 
  cancer_event = c(0, 1, 0, 0, 1),
  event = c(0,1, 1, 0, 0)
)

Thanks!

I tried to do this with the tmerge function, but because of the delayed entry I could not get it to work.

Answer 1

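This currently uses fuzzyjoin for the non-equi join mechanism (which, by my reading of the problem, is required). Once dplyr-1.1.0 is released, this can most likely be done with its join_by feature (ref: https://www.tidyverse.org/blog/2022/11/dplyr-1-1-0-is-coming-soon/#join-improvements).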

# library(fuzzyjoin)
out <- fuzzyjoin::fuzzy_left_join(
  df1, df2,
  by = c(id="id", start_time="stop_time_cancer", stop_time="stop_time_cancer"), 
  match_fun = list(`==`, `<=`, `>=`)
  ) %>%
  rowwise() %>%
  summarize(
    id = id.x,
    start_time = c(start_time, na.omit(stop_time_cancer)),
    stop_time = sort(c(na.omit(stop_time_cancer), stop_time)),
    event = c(if (!is.na(stop_time_cancer)) 0, event),
    cancer_event = c(0, if (!is.na(stop_time_cancer)) 1)
  )
out
# # A tibble: 5 × 5
#      id start_time stop_time event cancer_event
#   <dbl>      <dbl>     <dbl> <dbl>        <dbl>
# 1     1          5         6     0            0
# 2     1          6         8     1            1
# 3     2         10        17     1            0
# 4     3         15        20     0            0
# 5     3         20        25     0            1

Verification:

all.equal(df_temp, out[,names(df_temp)])
# [1] TRUE
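
For readers on dplyr 1.1.0 or later, the join_by route mentioned above might look roughly like the following. This is only a sketch under a few assumptions: each subject has at most one cancer time, that time falls strictly inside the follow-up window when present, and the names joined, before, after and out_jb are made up for illustration.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, which provides join_by()

# Same inputs as in the question.
df1 <- tibble(id = c(1, 2, 3),
              start_time = c(5, 10, 15),
              stop_time = c(8, 17, 25),
              event = c(1, 1, 0))
df2 <- tibble(id = c(1, 2, 3),
              stop_time_cancer = c(6, NA, 20),
              cancer_status = c(1, 0, 1))

# Non-equi join: keep a cancer time only if it lies inside the follow-up window.
# Subjects without such a time keep their row with stop_time_cancer = NA.
joined <- left_join(
  df1, df2,
  by = join_by(id, start_time <= stop_time_cancer, stop_time >= stop_time_cancer)
)

# Interval before the cancer time (or the whole window if there is no cancer time).
before <- joined %>%
  transmute(id,
            start_time,
            stop_time = coalesce(stop_time_cancer, stop_time),
            cancer_event = 0,
            event = if_else(is.na(stop_time_cancer), event, 0))

# Interval after the cancer time, only for subjects that have one.
after <- joined %>%
  filter(!is.na(stop_time_cancer)) %>%
  transmute(id,
            start_time = stop_time_cancer,
            stop_time,
            cancer_event = 1,
            event)

out_jb <- bind_rows(before, after) %>%
  arrange(id, start_time)
```

Splitting into a "before" and an "after" piece avoids the row-wise step, at the cost of assuming a single change point per subject.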

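Since the question also asks about tmerge, here is a hedged sketch of how delayed entry could be handled with it: the first call sets each subject's window through the tstart and tstop arguments, and later calls add the event and the time-dependent covariate. The inputs are rebuilt as plain data frames as a precaution (an assumption on my part, not something the question requires), and the names d1, d2, base, td, status and out_tm are made up for illustration.

```r
library(survival)

# Same values as in the question, stored as base data frames.
d1 <- data.frame(id = c(1, 2, 3),
                 start_time = c(5, 10, 15),
                 stop_time = c(8, 17, 25),
                 event = c(1, 1, 0))
d2 <- data.frame(id = c(1, 2, 3),
                 stop_time_cancer = c(6, NA, 20))

# Step 1: one row per subject covering (start_time, stop_time];
# tstart carries the delayed entry (age at entry).
base <- tmerge(d1, d1, id = id, tstart = start_time, tstop = stop_time)

# Step 2: attach the end-of-follow-up event under a new name, `status`,
# so it does not collide with the existing `event` column.
base <- tmerge(base, d1, id = id, status = event(stop_time, event))

# Step 3: split each window at the cancer time; 0 before, 1 after.
# Subjects with an NA cancer time are left unsplit.
td <- tmerge(base, d2, id = id, cancer_event = tdc(stop_time_cancer))

# Keep only the columns of the target layout.
out_tm <- data.frame(id = td$id,
                     start_time = td$tstart,
                     stop_time = td$tstop,
                     cancer_event = td$cancer_event,
                     event = td$status)
```

The resulting counting-process data could then be used in a model such as coxph(Surv(start_time, stop_time, event) ~ cancer_event, data = out_tm), although three subjects are far too few for a meaningful fit.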