
Probability of survival at particular time points using randomForestSRC

I'm using rfsrc to model a survival problem, like this:

library(OIsurv)
library(survival)
library(randomForestSRC)

data(burn)  # burn data set from KMsurv, which is loaded along with OIsurv

fit <- rfsrc(Surv(T1, D1) ~  ., data=burn)

# predict on the training set
pred <- predict(fit, burn)
pred$predicted

This gives me the overall survival probability for all patients.

How do I get the survival probability for each person at different time points, say 0-5 months or 0-10 months?

This is possible, although the documentation doesn't make it obvious if you aren't familiar with the package.

Load data

data(pbc, package = "randomForestSRC")

Create trial and test datasets

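library(dplyr)  # provides %>% and filter()

# patients with a recorded treatment form the trial set; the rest form the test set
pbc.trial <- pbc %>% filter(!is.na(treatment))
pbc.test <- pbc %>% filter(is.na(treatment))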

Build our model

rfsrc_pbc <- rfsrc(Surv(days, status) ~ .,
                   data = pbc.trial,
                   na.action = "na.impute")

Test our model

test.pred.rfsrc <- predict(rfsrc_pbc, 
                           pbc.test,
                           na.action="na.impute")

Everything we need is held in the prediction object. The $survival element is a matrix with one row per patient and one column per time of interest (time.interest). These times are chosen automatically, though you can constrain them with the ntime argument. Here the matrix is 106 x 122.

test.pred.rfsrc$survival

The $time.interest element is a vector of the times of interest (122 values, the same as the number of columns in the $survival matrix).

test.pred.rfsrc$time.interest

Say we want the predicted survival at 5 years. Because time is measured in days, we need the time of interest closest to 1825 days. Looking through $time.interest, element 83 is 1827 days, or roughly 5 years. Element 83 of $time.interest corresponds to column 83 of the $survival matrix, so the predicted probability of survival at 5 years is simply column 83 of the matrix.

test.pred.rfsrc$survival[,83]
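Rather than reading the index off by eye, it can be looked up programmatically. The following is a minimal sketch in base R; the `time.interest` and `survival` objects here are small made-up stand-ins for `test.pred.rfsrc$time.interest` and `test.pred.rfsrc$survival`, not real model output, and `target` and `idx` are names chosen for illustration.

```r
# Toy stand-ins for test.pred.rfsrc$time.interest and test.pred.rfsrc$survival
# (made-up values for illustration only).
time.interest <- c(41, 400, 1012, 1827, 2540)
survival <- rbind(
  c(0.99, 0.95, 0.88, 0.71, 0.60),  # patient 1
  c(0.97, 0.85, 0.64, 0.42, 0.30)   # patient 2
)

target <- 5 * 365                               # 5 years expressed in days
idx <- which.min(abs(time.interest - target))   # position of the closest time of interest

time.interest[idx]   # the time actually used
survival[, idx]      # predicted survival at that time, one value per patient
```

With a real prediction object, the same two lines should work after swapping in `test.pred.rfsrc$time.interest` and `test.pred.rfsrc$survival`.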

You can then do this for whichever time points you're interested in.
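For several time points at once, one option is a small helper like the sketch below. Note that it swaps in a slightly different rule from the "closest time" lookup: it assumes the predicted survival curve is a step function that only changes at the times of interest, so it uses the last time of interest at or before each requested time, and returns 1 for times before the first one. That assumption, the helper name `surv_at`, and the toy data are all illustrative and not part of randomForestSRC.

```r
# Toy stand-ins for test.pred.rfsrc$time.interest and test.pred.rfsrc$survival
# (made-up values for illustration only).
time.interest <- c(41, 400, 1012, 1827, 2540)
survival <- rbind(
  c(0.99, 0.95, 0.88, 0.71, 0.60),  # patient 1
  c(0.97, 0.85, 0.64, 0.42, 0.30)   # patient 2
)

# Hypothetical helper: survival for every patient at arbitrary times,
# treating each curve as a right-continuous step function (an assumption).
surv_at <- function(survival, time.interest, times) {
  # index of the last time of interest <= each requested time (0 if none)
  idx <- findInterval(times, time.interest)
  out <- matrix(1, nrow = nrow(survival), ncol = length(times))
  has <- idx > 0
  out[, has] <- survival[, idx[has], drop = FALSE]
  colnames(out) <- paste0("t=", times)
  out
}

res <- surv_at(survival, time.interest, c(10, 150, 1825, 3000))
res
```

Under this rule the value at 1825 days comes from the column for 1012 days rather than 1827 days, so the result can differ from the closest-time approach when the times of interest are sparse.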
