
Reshaping longitudinal dataset with tmerge or survSplit?

I'm attempting to conduct survival analysis with time-varying covariates. The data come from a longitudinal survey that is administered annually, and I have manipulated it to look like this:

id  event       end.time    income1      income2    income3     income4
1   1           3           8            10         13          8       
2   0           4           13           15         24          35

event indicates whether the event occurred, end.time is the time to event (or to censoring), and the columns to the right hold the time-varying covariate for each successive period. So, for observation 1, the event occurred at year 3, and during year 1 they earned an income of 8 thousand dollars, and so on. Observation 2 is censored, and we have data up to year 4 (when the study ends).

In the end, I'd like my data to look something like this:

id  st.time end.time    event   inc
1   0       1           0       8
1   1       2           0       10
1   2       3           1       13
2   0       1           0       13
2   1       2           0       15
2   2       3           0       24
2   3       4           0       35

I've looked up the tmerge() and survSplit() functions but am unsure how to apply them in this specific situation. It seems that with survSplit() I could use yearly cut points, but I'm not sure how it would reshape the time-varying covariates.

Might a generic reshape work better?

Any advice would be appreciated.

A general reshape along with some manipulation with dplyr would probably work.

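library(dplyr)

df %>%
  # one row per id and income column
  tidyr::pivot_longer(cols = starts_with('income'), values_to = 'inc') %>%
  group_by(id) %>%
  # keep only the periods up to each id's end.time
  slice(1:first(end.time)) %>%
  mutate(end.time = row_number(),              # period end: 1, 2, ...
         st.time = end.time - 1,               # period start
         event = replace(event, -n(), 0)) %>%  # event only in the last period
  select(-name)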


#     id event end.time   inc st.time
#  <int> <dbl>    <dbl> <int>   <dbl>
#1     1     0        1     8       0
#2     1     0        2    10       1
#3     1     1        3    13       2
#4     2     0        1    13       0
#5     2     0        2    15       1
#6     2     0        3    24       2
#7     2     0        4    35       3
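
Since the question asks about survSplit() specifically, here is a rough sketch of how it might be used for the same reshape. It assumes the periods are exactly one year long, so that the interval ending at time k takes its value from the income column numbered k; the names long, inc_cols and period_end are made up for illustration. survSplit() only splits the follow-up time and copies the wide income columns onto every row, so the matching income still has to be picked out by hand afterwards.

```r
library(survival)

# Same example data as in the question
df <- data.frame(id = 1:2, event = c(1, 0), end.time = c(3, 4),
                 income1 = c(8, 13), income2 = c(10, 15),
                 income3 = c(13, 24), income4 = c(8, 35))

# Split each subject's follow-up at years 1, 2 and 3
long <- survSplit(Surv(end.time, event) ~ ., data = df, cut = 1:3,
                  start = "st.time", end = "end.time", event = "event")
long <- long[order(long$id, long$st.time), ]

# Assumption: the interval ending at year k uses column income<k>
inc_cols <- paste0("income", 1:4)
period_end <- as.integer(long$end.time)
long$inc <- as.matrix(long[inc_cols])[cbind(seq_len(nrow(long)), period_end)]

# Keep only the counting-process columns
long <- long[, c("id", "st.time", "end.time", "event", "inc")]
long
```

The event indicator is set to 0 by survSplit() on every interval except the one that contains the original end.time, so no extra step is needed for that.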

Data

df <- structure(list(id = 1:2, event = 1:0, end.time = 3:4, income1 = c(8L, 
13L), income2 = c(10L, 15L), income3 = c(13L, 24L), income4 = c(8L, 
35L)), class = "data.frame", row.names = c(NA, -2L))
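
For completeness, a tmerge() version could look roughly like the sketch below. It assumes that income k applies from time k - 1 onward, which is my reading of the desired output in the question; base, inc_long, start and futime are illustrative names, not anything from the original post. The idea is to build a one-row-per-subject baseline with the event first, then merge the long income table in as a time-dependent covariate with tdc().

```r
library(survival)
library(tidyr)

# Same example data as in the question
df <- data.frame(id = 1:2, event = c(1, 0), end.time = c(3, 4),
                 income1 = c(8, 13), income2 = c(10, 15),
                 income3 = c(13, 24), income4 = c(8, 35))

# Baseline with renamed columns, so nothing clashes with tmerge's event()
base <- data.frame(id = df$id, futime = df$end.time, status = df$event)

# Long covariate table; assumption: income<k> takes effect at time k - 1
inc_long <- pivot_longer(df, starts_with("income"), names_to = "year",
                         names_prefix = "income", values_to = "inc")
inc_long$start <- as.numeric(inc_long$year) - 1
# Drop measurements that begin at or after the subject's end of follow-up
inc_long <- inc_long[inc_long$start < inc_long$end.time, ]

# Step 1: set each subject's time range (0, futime] and add the event
out <- tmerge(base["id"], base, id = id, tstop = futime,
              event = event(futime, status))
# Step 2: add income as a time-dependent covariate
out <- tmerge(out, inc_long, id = id, inc = tdc(start, inc))
out
```

The result has columns tstart and tstop in place of st.time and end.time; they can be renamed if the exact layout from the question is needed.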
