Survival time in survival analysis

Question

I am having trouble analyzing a survival dataset. I have put the dput output of the dataset in a GitHub gist to avoid cluttering the question.

Here is the data: https://gist.github.com/anonymous/4fdff1c6d0853c41939e2a67d9e0e45b

I want to plot survival curves for each group in this dataset, so I need to fit a survfit() model.

The variables W1, W2, ..., W43 represent weeks, and the numbers are some measurement. A dot (.) in a given week means that the individual died that week, and consequently every following week is also marked with a dot.

In a survival model this death is an event (failure), and an individual who survives all the weeks is a censored observation.

To fit a survival model the way I know, I need data like this:

time=c(3,4,8,8,5,2)
event=c(1,1,0,0,1,1)

In this case time is the time of death (or of censoring) in weeks, and event is 1 for a death and 0 for a censored observation.
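For illustration, vectors in this shape can be passed directly to Surv() and survfit() from the survival package. This is only a minimal sketch using the six toy values above, not the real dataset, and it fits a single curve with no grouping.

```r
library(survival)  # recommended package, shipped with standard R installations

# Toy vectors from the question: weeks until death or censoring, and the event flag
time  <- c(3, 4, 8, 8, 5, 2)
event <- c(1, 1, 0, 0, 1, 1)

# Kaplan-Meier estimate for a single group (no covariates)
fit <- survfit(Surv(time, event) ~ 1)
summary(fit)  # survival probability at each observed death time
```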

EDIT: I thought of one possible solution, but I don't know how to implement it. The idea is below.

1) Take all the columns W1, W2, ..., W43 and put 1 if the value is a number and 0 if it is a dot.

2) Create a new variable time whose value is the sum of columns W1 to W43, i.e. W1+W2+...+W43.

3) Create a new variable event: if time=43, the individual survived the whole period and event is 0 (censored); if time is less than 43, the individual died and event is 1.

Can anyone help me do this?
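The three steps in the edit could be sketched roughly as follows. The data frame toy, its four weekly columns, and the variable names alive and n_weeks are hypothetical stand-ins for the real 43-column dataset, and the sketch assumes the data frame contains only the weekly columns.

```r
# Hypothetical toy data: three individuals, four weekly columns instead of 43
toy <- data.frame(W1 = c("5", "7", "6"),
                  W2 = c("4", ".", "6"),
                  W3 = c(".", ".", "5"),
                  W4 = c(".", ".", "7"),
                  stringsAsFactors = FALSE)
n_weeks <- ncol(toy)  # 4 here, 43 in the real data

# Step 1: TRUE (counted as 1) where a measurement exists, FALSE (0) where the cell is a dot
alive <- toy != "."

# Step 2: time = number of weeks with a measurement
toy$time <- rowSums(alive)

# Step 3: event = 1 if the individual died during follow-up, 0 if censored
toy$event <- as.numeric(toy$time < n_weeks)
```

Note that this counts the weeks with a measurement, so a death in week k is recorded as time k - 1 rather than k; which convention is appropriate depends on how the weeks were recorded.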

Answer 1

I named your dataset sdat; these operations add the two additional columns:

# Week of the first "." in each row (NA if the individual never died)
sdat$time <- apply(sdat[ , grepl("^W[0-9]+$", names(sdat))], 1,  # work by rows on the "W" columns
                   function(r) which(r == ".")[1])               # sequence number of the first "."
sdat$event <- as.numeric(!is.na(sdat$time))           # 1 = died (a "." was found), 0 = censored
sdat$time <- ifelse(is.na(sdat$time), 43, sdat$time)  # set time to 43 for survivors

 # Check results
 head( sdat[ , !grepl("W", names(sdat))] ) # remove "W" cols
  Group Ref Sex  M1   M2 M3  M4 time event
1    11   4   1 959 1940 10 184   23     1
2    11   4   1 960 1770 10 189   31     1
3    11   4   1 961 1970 10 166   23     1
4    11   4   1 962 1870  1 180   43     0
5    11   4   1 964 1780 11 239   43     0
6    12   4   1 966 1980 11 182   43     1

As an analyst I would ask what meaning to attach to the varying "W" values leading up to the deaths, but that was not your question.
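With time and event in place, the per-group curves the question asks for could be fitted and plotted along these lines. This is a sketch only: sdat6 is a hypothetical data frame holding just the six rows printed above, so the curves themselves are not meaningful; with the full data, sdat would take its place.

```r
library(survival)

# Only the six rows printed above; the full sdat would be used in practice
sdat6 <- data.frame(Group = c(11, 11, 11, 11, 11, 12),
                    time  = c(23, 31, 23, 43, 43, 43),
                    event = c(1, 1, 1, 0, 0, 1))

# One Kaplan-Meier curve per group
fit <- survfit(Surv(time, event) ~ Group, data = sdat6)

# Step plot with one line type per group
plot(fit, lty = seq_along(fit$strata),
     xlab = "Week", ylab = "Survival probability")
legend("bottomleft", legend = names(fit$strata), lty = seq_along(fit$strata))
```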

Answer 1
1 accepted 2016-09-30 16:49:03
