Transform data to long format in R given survival time
Consider the following sample dataset.
* id is an individual's identifier.
* Surv_time is the individual's survival time.
* start is the time at which zj is measured; zj is a time-varying covariate.
rm(list=ls()); set.seed(1)
n<-5
Surv_time<-round( runif( n, 12 , 20 ) ) #Survival time
dat<-data.frame(id=1:n, Surv_time )
ntp<- rep(3, n) # three measurements per individual.
mat<-matrix(ncol=2,nrow=1)
m=0; w <- mat
for(l in ntp)
{
m=m+1
ft<- seq(from = runif(1,0,8), to = runif(1,12,20) , length.out = l)
seq<-round(ft)
matid<-cbind( matrix(seq,ncol=1 ) ,m)
w<-rbind(w,matid)
}
d<-data.frame(w[-1,])
colnames(d)<-c("start","id")
D <- merge(d,dat,by="id") #merging dataset
D$zj <- with(D, 0.3*start)
D
   id start Surv_time  zj
1   1     7        14 2.1
2   1    13        14 3.9
3   1    20        14 6.0
4   2     5        15 1.5
5   2    11        15 3.3
6   2    17        15 5.1
7   3     0        17 0.0
8   3     7        17 2.1
9   3    14        17 4.2
10  4     1        19 0.3
11  4     9        19 2.7
12  4    17        19 5.1
13  5     3        14 0.9
14  5    11        14 3.3
15  5    18        14 5.4
I need code to transform the data to start-stop format, where the last stop for an individual is at Surv_time. The idea is to create start-stop intervals in which the stop of one interval is the start of the next. Measurements whose start falls after Surv_time are dropped. I should end up with:
   id start stop Surv_time  zj
1   1     7   13        14 2.1
2   1    13   14        14 3.9
4   2     5   11        15 1.5
5   2    11   15        15 3.3
7   3     0    7        17 0.0
8   3     7   14        17 2.1
9   3    14   17        17 4.2
10  4     1    9        19 0.3
11  4     9   17        19 2.7
12  4    17   19        19 5.1
13  5     3   11        14 0.9
14  5    11   14        14 3.3
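This might not be the most elegant solution, but it works on the sample data. It assumes the rows of D are sorted by id and then by start, and because it does not group by id, it can mislabel an individual's last interval if the next individual's first start falls between that interval's start and Surv_time.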
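library(tidyverse)
# Take each row's stop from the next row's start (Inf for the very last row),
# then drop measurements taken after the survival time.
D <- D %>%
  mutate(stop = c(start[-1], Inf)) %>%
  filter(start <= Surv_time)
# A stop beyond Surv_time, or before start (the next row belongs to another id),
# marks an individual's last interval: cap it at Surv_time.
last_interval <- D$stop > D$Surv_time | D$stop < D$start
D$stop[last_interval] <- D$Surv_time[last_interval]
D <- D %>% select(id, start, stop, Surv_time, zj)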
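A quick way to confirm that a result really is in the start-stop format the question describes is to test its invariants directly. The following is only a sketch: `check_start_stop` is a hypothetical helper name, not something from the question or from any package, and the `expected` data frame is the desired output from the question typed in by hand.

```r
# Hypothetical helper (not from the original post): TRUE when, within every id,
# each stop equals the next start, the last stop equals Surv_time,
# and every interval has positive length.
check_start_stop <- function(df) {
  per_id <- split(df, df$id)
  all(vapply(per_id, function(g) {
    g <- g[order(g$start), ]
    k <- nrow(g)
    contiguous   <- k == 1 || all(g$stop[-k] == g$start[-1])
    ends_at_surv <- g$stop[k] == g$Surv_time[k]
    positive     <- all(g$start < g$stop)
    contiguous && ends_at_surv && positive
  }, logical(1)))
}

# Desired output from the question, entered by hand (zj omitted: not needed here).
expected <- data.frame(
  id        = c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5),
  start     = c(7, 13, 5, 11, 0, 7, 14, 1, 9, 17, 3, 11),
  stop      = c(13, 14, 11, 15, 7, 14, 17, 9, 17, 19, 11, 14),
  Surv_time = c(14, 14, 15, 15, 17, 17, 17, 19, 19, 19, 14, 14)
)

check_start_stop(expected)
```

Running the same check on the data frame produced by a transformation gives a cheap guard against intervals that overlap, leave gaps, or run past the survival time.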
We can use dplyr:
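library(dplyr)
D %>%
  group_by(id) %>%
  # stop is the next start within the same id, capped at Surv_time
  mutate(stop = pmin(lead(start, default = Inf), Surv_time)) %>%
  # drop measurements taken at or after the survival time
  filter(start < Surv_time) %>%
  ungroup() %>%
  select(id, start, stop, Surv_time, zj)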
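For anyone who prefers to avoid extra packages, the grouped idea can also be sketched in base R. This is an illustration, not the answerer's code: the data frame below is the question's sample output typed in by hand, and `next_start` and `long` are names chosen here. It treats a measurement taken exactly at Surv_time as outside follow-up, which is an assumption the question does not settle.

```r
# Sample data from the question, entered by hand so the sketch is self-contained.
D <- data.frame(
  id        = rep(1:5, each = 3),
  start     = c(7, 13, 20, 5, 11, 17, 0, 7, 14, 1, 9, 17, 3, 11, 18),
  Surv_time = rep(c(14, 15, 17, 19, 14), each = 3)
)
D$zj <- 0.3 * D$start

# Sort so that "next row within id" means "next measurement time".
D <- D[order(D$id, D$start), ]

# Next start within each id; the last row of each id gets Inf.
next_start <- ave(D$start, D$id, FUN = function(x) c(x[-1], Inf))

# Cap every stop at the individual's survival time.
D$stop <- pmin(next_start, D$Surv_time)

# Keep only measurements taken before the survival time (assumption: strict <).
long <- D[D$start < D$Surv_time, c("id", "start", "stop", "Surv_time", "zj")]
long
```

Grouping by id (here through `ave`) means the result does not depend on how one individual's measurement times compare with the next individual's.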