Adding time varying covariates to survival data using 'tmerge' in 'survival' package

I'm trying to add several time dependent covariates to a dataset for survival analysis using tmerge from the survival package. I mean to add each sequentially, as recommended in the vignette on the subject, but the output from the first addition does not work as I intended.

More specifically, I have one simple data.frame with the ids of the individuals (organizations) and the number of days (age) until each organization ceases activities. The second data.frame has the ids and the number of days until the organization experiences a "transition" event. Not all organizations experience a transition, so not all organizations are present in the second data.frame.

In the first call to tmerge I put the first data.frame into the format the package uses. In the second I try to add a variable that counts the number of transitions an organization has experienced. For most organizations the result is as I expect, but for a small number the result does not make sense, and I can see no obvious reason why it fails.

The data.frames are small, so I post them along with the code below.

ages <- structure(list(id = c(1L, 2L, 5L, 6L, 9L, 10L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 20L, 21L, 24L, 26L, 27L, 28L, 29L, 30L, 31L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 42L, 45L, 46L, 43L, 48L, 49L, 50L, 51L, 52L, 54L, 55L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 8L, 19L, 22L, 23L, 33L, 41L), age = c(13668, 21550, 15249, 21550, 16045, 21550, 14976, 14976, 6574, 21550, 4463, 16927, 16927, 15706, 4567, 21306, 17235, 22158, 19692, 17632, 17597, 4383, 5811, 7704, 5063, 17351, 17015, 16801, 4383, 5080, 13185, 12604, 19784, 5310, 15369, 13239, 1638, 21323, 10914, 21262, 7297, 17214, 17508, 14199, 14062, 2227, 8434, 4593, 14429, 21323, 4782, 10813, 2667, 2853, 5709, 3140, 12237, 7882, 21550, 15553, 16466, 16621, 19534, 21842)), .Names = c("id", "age"), row.names = c(NA, 64L), class = "data.frame")
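library(survival)  # provides tmerge()
# First call: set up the follow-up interval (0, age] for each id
ages1 <- tmerge(ages, ages, id=id, tstop=age)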
transitions <- structure(list(id = c(2L, 2L, 6L, 8L, 10L, 19L, 22L, 23L, 24L, 31L, 33L, 41L, 43L, 43L, 52L, 55L, 66L), transition = structure(c(18993, 13668, 15249, 15706, 15887, 11609, 4023, 9316, 16193, 1461, 4584, 17824, 3713, 11261, 16818, 10670, 15479), class = "difftime", units = "days")), .Names = c("id", "transition"), row.names = c(3L, 4L, 7L, 8L, 11L, 20L, 25L, 27L, 28L, 35L, 38L, 47L, 49L, 51L, 59L, 61L, 73L), class = "data.frame")
newdata <- tmerge(ages1, transitions, id=id, transition=cumtdc(transition))

As an example of one that fails, consider id=22. It experiences one transition after 4023 days. So, tmerge should create two new rows with id=22: one for 0 to 4023 and one for 4023 to 16466 (the age the organization 'dies'). Both of these are created, but so is a third unnecessary row for id=22 with a start of 0 and a stop of 16466.
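One way to spot rows like this without reading the output by eye is to look for ids whose intervals overlap. The following is only a minimal base R sketch: the helper name overlapping_ids is hypothetical, the toy rows reproduce the id=22 pattern described above (plus id=1 as a clean case), and it assumes the output uses tmerge's default tstart/tstop column names.

```r
# Toy rows: id 22 has the two expected intervals plus the
# unwanted (0, 16466] row; id 1 is a clean single interval.
rows <- data.frame(
  id     = c(22, 22, 22, 1),
  tstart = c(0, 4023, 0, 0),
  tstop  = c(4023, 16466, 16466, 13668)
)

# Hypothetical helper: return the ids that have overlapping intervals.
overlapping_ids <- function(d) {
  # Sort so that rows of the same id are adjacent and ordered in time
  d <- d[order(d$id, d$tstart, d$tstop), ]
  # Compare each row with the one before it
  same_id  <- d$id[-1] == d$id[-nrow(d)]
  overlaps <- d$tstart[-1] < d$tstop[-nrow(d)]
  unique(d$id[-1][same_id & overlaps])
}

bad_ids <- overlapping_ids(rows)
print(bad_ids)
```

Applied to the real output, overlapping_ids(newdata) should list the organizations that need a closer look.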

There are 17 transitions spread across the 64 organizations and I count 3 errors like the one above and cannot figure out what sets these 3 apart from the remaining (successful) cases. I could easily fix these 3 but as other TVCs are added, the time cost of detecting and fixing such errors will rise exponentially. Any ideas about what I'm missing?

The problem is solved by sorting ages1 by id before the second call to tmerge: ages1 <- ages1[order(ages1$id), ]. The package creator provided this solution.
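As a hedged illustration of the sorted workflow, here is a small sketch on a four-organization subset of the data from the question (ids 1, 2, 43 and 22, deliberately out of order). Two things differ from the original and are assumptions of this sketch: the transition times are plain numbers rather than difftime, and the sort is applied to ages before the first tmerge call rather than to ages1 afterwards, so that both calls see ids in order.

```r
library(survival)

# Subset of the question's data, with ids out of order as in the original
ages <- data.frame(
  id  = c(1L, 2L, 43L, 22L),
  age = c(13668, 21550, 19784, 16466)
)
transitions <- data.frame(
  id         = c(2L, 2L, 22L, 43L, 43L),
  transition = c(18993, 13668, 4023, 3713, 11261)  # days, plain numeric
)

# Check the ordering, then sort by id before building the tmerge data set
if (is.unsorted(ages$id)) {
  ages <- ages[order(ages$id), ]
}

# First call: one interval (0, age] per organization
ages1 <- tmerge(ages, ages, id = id, tstop = age)

# Second call: cumulative count of transitions as a time-dependent covariate
newdata <- tmerge(ages1, transitions, id = id,
                  transition = cumtdc(transition))

# Inspect the rows for the organization discussed in the question
print(newdata[newdata$id == 22, ])
```

Checking is.unsorted() on the id column before each further tmerge call is a cheap guard when more time-dependent covariates are added later.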
