Transform data to long format in R given survival time

Consider the following sample dataset.

* id is an individual's identifier.

* Surv_time is an individual's survival time.

* start is the time at which zj is measured. zj is a time-varying covariate.

rm(list=ls()); set.seed(1)
n<-5
Surv_time<-round( runif( n, 12 , 20  ) ) #Survival time
dat<-data.frame(id=1:n, Surv_time )
ntp<- rep(3, n) # three measurements per individual. 
mat<-matrix(ncol=2,nrow=1)
m=0; w <- mat
for(l in ntp)
{
  m=m+1
  ft<- seq(from = runif(1,0,8), to =  runif(1,12,20)  , length.out = l)
  seq<-round(ft)
  matid<-cbind( matrix(seq,ncol=1 ) ,m)
  w<-rbind(w,matid)
}

d<-data.frame(w[-1,])
colnames(d)<-c("start","id")
D <-  merge(d,dat,by="id") #merging dataset
D$zj <- with(D, 0.3*start)
D
   id start Surv_time  zj
1   1     7        14 2.1
2   1    13        14 3.9
3   1    20        14 6.0
4   2     5        15 1.5
5   2    11        15 3.3
6   2    17        15 5.1
7   3     0        17 0.0
8   3     7        17 2.1
9   3    14        17 4.2
10  4     1        19 0.3
11  4     9        19 2.7
12  4    17        19 5.1
13  5     3        14 0.9
14  5    11        14 3.3
15  5    18        14 5.4

I need code to transform the data to start-stop format, where the last stop for each individual is that individual's Surv_time. The idea is to create start-stop intervals in which the stop of one interval is the start of the next. Measurements taken after Surv_time are dropped. I should end up with:

  id start    stop  Surv_time  zj 
1   1     7    13     14       2.1    
2   1    13    14     14       3.9   

4   2     5    11     15       1.5    
5   2    11    15     15       3.3   

7   3     0    7      17       0.0    
8   3     7    14     17       2.1    
9   3    14    17     17       4.2   

10  4     1    9      19       0.3    
11  4     9    17     19       2.7    
12  4    17    19     19       5.1   

13  5     3    11     14       0.9    
14  5    11    14     14       3.3   

This might not be the most elegant solution, but it should work:

library(tidyverse)

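D <- D %>% 
  # stop = start of the following row (NA for the very last row)
  mutate(stop = lead(start)) %>% 
  # drop measurements taken after the individual's survival time
  filter(start <= Surv_time)

# Where stop is missing, overshoots Surv_time, or belongs to the next
# individual (stop < start), end the interval at Surv_time instead
fix <- is.na(D$stop) | D$stop > D$Surv_time | D$stop < D$start
D$stop[fix] <- D$Surv_time[fix]

D <- D %>% select(id, start, stop, Surv_time, zj)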
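One way to sanity-check a result like this is to test the rules stated in the question directly: within each id, every stop equals the next start, and the last stop equals Surv_time. The sketch below is only an illustration. `check_start_stop()` is a hypothetical helper, not part of any package, and `res` is the expected output for the first two individuals, copied from the question.

```r
# Expected result for individuals 1 and 2, copied from the question
res <- data.frame(
  id        = c(1, 1, 2, 2),
  start     = c(7, 13, 5, 11),
  stop      = c(13, 14, 11, 15),
  Surv_time = c(14, 14, 15, 15)
)

# Hypothetical helper: TRUE if every individual's rows follow the
# start-stop rules described in the question
check_start_stop <- function(df) {
  per_id <- split(df, df$id)
  ok <- vapply(per_id, function(g) {
    g <- g[order(g$start), ]
    n <- nrow(g)
    chained  <- all(g$stop[-n] == g$start[-1])  # each stop is the next start
    ends_ok  <- g$stop[n] == g$Surv_time[n]     # last stop is Surv_time
    positive <- all(g$start < g$stop)           # no empty or reversed intervals
    chained && ends_ok && positive
  }, logical(1))
  all(ok)
}

check_start_stop(res)
```

Running the same check on the transformed D (instead of `res`) gives a quick way to confirm that the transformation behaved as intended.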

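We can use dplyr, working within each id so that stop never picks up the next individual's start:

library(dplyr)

D %>%
  group_by(id) %>%
  mutate(stop = lead(start, default = Inf),  # next measurement time within the id
         stop = pmin(stop, Surv_time)) %>%   # cap every interval at Surv_time
  filter(start < Surv_time) %>%              # drop measurements at or after Surv_time
  ungroup()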
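If you would rather avoid extra packages, the same idea can be sketched in base R. This is an illustration under some assumptions, not a definitive implementation: `to_start_stop()` is a hypothetical helper name, the sample data are typed in from the printed D in the question, and measurements taken at or after Surv_time are dropped.

```r
# Sample data copied from the printed D in the question
D <- data.frame(
  id        = rep(1:5, each = 3),
  start     = c(7, 13, 20, 5, 11, 17, 0, 7, 14, 1, 9, 17, 3, 11, 18),
  Surv_time = rep(c(14, 15, 17, 19, 14), each = 3)
)
D$zj <- 0.3 * D$start

# Hypothetical helper (not from the original answers)
to_start_stop <- function(df) {
  # sort so each individual's measurements are in time order
  df <- df[order(df$id, df$start), ]
  # next measurement time within the same individual; Inf after the last one
  nxt <- ave(df$start, df$id, FUN = function(s) c(s[-1], Inf))
  # an interval ends at the next measurement or at Surv_time, whichever is first
  df$stop <- pmin(nxt, df$Surv_time)
  # drop measurements taken at or after the survival time
  df <- df[df$start < df$Surv_time, c("id", "start", "stop", "Surv_time", "zj")]
  rownames(df) <- NULL
  df
}

long <- to_start_stop(D)
long
```

Grouping with `ave()` plays the role of `group_by(id)` here, and `pmin()` caps each stop at Surv_time in one vectorised step.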
