Reshaping longitudinal dataset with tmerge or survSplit?

I'm attempting to conduct survival analysis with time-varying covariates. The data comes from a longitudinal survey that is administered annually, and I have manipulated it to look like this:

id  event       end.time    income1      income2    income3     income4
1   1           3           8            10         13          8       
2   0           4           13           15         24          35

event indicates whether the event occurred or not, end.time is the time to event, and I have my time-varying covariates for each subsequent period to the right. So, for observation 1, the event occurred at year 3, and during year 1, they earned an income of 8 thousand dollars, etc. For observation 2, the event is censored, and we have data up to year 4 (when the study ends).

In the end, I'd like my data to look something like this:

id  st.time end.time    event   inc
1   0       1           0       8
1   1       2           0       10
1   2       3           1       13
2   0       1           0       13
2   1       2           0       15
2   2       3           0       24
2   3       4           0       35

I've looked up the tmerge() and survSplit() functions but am unsure how to apply them in this specific situation. It seems that with survSplit() I could use yearly cutpoints, but I'm not sure how it would reshape the time-varying covariates.

Would a generic reshape work better?

Any advice would be appreciated.

A general reshape followed by some manipulation with dplyr should work.

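library(dplyr)

df %>%
  # one row per id and income column; the income value goes into 'inc'
  tidyr::pivot_longer(cols = starts_with('income'), values_to = 'inc') %>%
  group_by(id) %>%
  # keep only the years up to each id's end.time
  slice(1:first(end.time)) %>%
  # each row becomes the interval (st.time, end.time];
  # event is set to 0 on every row except the last one for each id
  mutate(end.time = row_number(),
         st.time = end.time - 1,
         event = replace(event, -n(), 0)) %>%
  select(-name)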


#     id event end.time   inc st.time
#  <int> <dbl>    <dbl> <int>   <dbl>
#1     1     0        1     8       0
#2     1     0        2    10       1
#3     1     1        3    13       2
#4     2     0        1    13       0
#5     2     0        2    15       1
#6     2     0        3    24       2
#7     2     0        4    35       3

data

df <- structure(list(id = 1:2, event = 1:0, end.time = 3:4, income1 = c(8L, 
13L), income2 = c(10L, 15L), income3 = c(13L, 24L), income4 = c(8L, 
35L)), class = "data.frame", row.names = c(NA, -2L))
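
Since the question asks about survSplit() specifically, here is a rough sketch of that route for comparison. It assumes the survival package is installed, that event times are whole years (so each interval ends at the index of the income column that applies to it), and that there are exactly four income columns; inc_cols and long are names made up for this illustration.

library(survival)

# Same toy data as above, rebuilt here so the block runs on its own.
df <- data.frame(id = 1:2, event = c(1, 0), end.time = c(3, 4),
                 income1 = c(8, 13), income2 = c(10, 15),
                 income3 = c(13, 24), income4 = c(8, 35))

# Split each subject's follow-up at the yearly boundaries 1, 2, 3.
# survSplit() sets event to 0 on every interval except a subject's last.
long <- survSplit(Surv(end.time, event) ~ ., data = df, cut = 1:3,
                  start = "st.time", end = "end.time", event = "event")

# Assumption: whole-year event times, so the interval ending at year k
# takes its value from the column income<k>.
inc_cols <- paste0("income", 1:4)
long$inc <- as.matrix(long[inc_cols])[cbind(seq_len(nrow(long)), long$end.time)]

# Keep only the columns of the target layout.
long <- long[c("id", "st.time", "end.time", "event", "inc")]
long

survSplit() handles the interval splitting and the event indicator, but it only copies the wide income columns onto every row, so picking the right income for each interval is still a manual step. With event times that are not whole years, the lookup would need the episode number rather than end.time.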
