
How can I calculate the survival function in a gbm package analysis?

I would like to analyze my data with a gradient boosted model.

However, because my data come from a cohort (time-to-event data), I have trouble understanding the output of this model.

Here's my code. The analysis below uses example data.

install.packages("randomForestSRC")
install.packages("gbm")
install.packages("survival")

library(randomForestSRC)
library(gbm)
library(survival)

data(pbc, package="randomForestSRC")
data <- na.omit(pbc)

set.seed(9512)
train <- sample(1:nrow(data), round(nrow(data)*0.7))
data.train <- data[train, ]
data.test <- data[-train, ]

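set.seed(9741)
# Fit a boosted Cox proportional hazards model.
# The fitted object is named gbm.fit so it does not shadow the gbm() function.
gbm.fit <- gbm(Surv(days, status) ~ .,
               data = data.train,
               interaction.depth = 2,
               shrinkage = 0.01,
               n.trees = 500,
               distribution = "coxph")

summary(gbm.fit)


# Predict f(x) for the test set (prediction is deterministic, so no seed is needed)
gbm.pred <- predict(gbm.fit,
                    n.trees = 500,
                    newdata = data.test,
                    type = "response")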

As I read the package documentation, "gbm.pred" is the prediction f(x) of the model fitted with Cox's partial likelihood.

# Smoothed (non-cumulative) baseline hazard, evaluated at the sorted test times
lambda0 <- basehaz.gbm(t = data.test$days,
                       delta = data.test$status,
                       t.eval = sort(data.test$days),
                       cumulative = FALSE,
                       f.x = gbm.pred,
                       smooth = TRUE)

hazard <- lambda0 * exp(gbm.pred)

In this code, lambda0 is the baseline hazard function.

So, according to the formula h(t|x) = lambda0(t) * exp(f(x)),

"hazard" is the hazard function.

However, what I want to calculate is the "survival function",

because I would like to compare the outcome in the original data (data$status) with the prediction result (the survival function).

Please let me know how to calculate the survival function.

Thank you

Actually, with cumulative = TRUE (its default), basehaz.gbm returns the cumulative baseline hazard function (the integral part, $\int_0^t \lambda_0(z)\,dz$), and the survival function can be computed as below:

$S(t \mid X) = \exp\{-e^{f(X)} \int_0^t \lambda_0(z)\,dz\}$
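For readers who want the intermediate steps, this expression follows from the standard relation between the hazard and the survival function. Here is a short sketch, where $\lambda_0$ denotes the baseline hazard and $H$ the cumulative hazard (notation chosen here for illustration):

```latex
\begin{align*}
h(t \mid X) &= \lambda_0(t)\, e^{f(X)} \\
H(t \mid X) &= \int_0^t h(z \mid X)\, dz = e^{f(X)} \int_0^t \lambda_0(z)\, dz \\
S(t \mid X) &= \exp\{-H(t \mid X)\} = \exp\Big\{-e^{f(X)} \int_0^t \lambda_0(z)\, dz\Big\}
\end{align*}
```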

f(X) is the gbm prediction, which is the log hazard ratio relative to the baseline.
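A minimal sketch of how this could look in R, reusing the data and settings from the question. Some choices here are assumptions rather than part of the original code: the names dat, fit, t.eval, H0 and surv are illustrative, the baseline hazard is estimated on the training set, and the evaluation grid is restricted to the range of observed training event times because basehaz.gbm interpolates and returns NA outside that range.

```r
library(gbm)
library(survival)

data(pbc, package = "randomForestSRC")
dat <- na.omit(pbc)

set.seed(9512)
idx <- sample(nrow(dat), round(nrow(dat) * 0.7))
dat.train <- dat[idx, ]
dat.test  <- dat[-idx, ]

set.seed(9741)
fit <- gbm(Surv(days, status) ~ ., data = dat.train,
           distribution = "coxph", n.trees = 500,
           interaction.depth = 2, shrinkage = 0.01)

# f(x): predicted log hazard ratio for each subject
f.train <- predict(fit, newdata = dat.train, n.trees = 500)
f.test  <- predict(fit, newdata = dat.test,  n.trees = 500)

# Evaluation grid (assumption: 50 points within the observed event-time range)
event.times <- dat.train$days[dat.train$status == 1]
t.eval <- seq(min(event.times), max(event.times), length.out = 50)

# Cumulative baseline hazard H0(t), estimated from the training data
H0 <- basehaz.gbm(t = dat.train$days, delta = dat.train$status,
                  f.x = f.train, t.eval = t.eval,
                  smooth = FALSE, cumulative = TRUE)

# S(t | x) = exp(-H0(t) * exp(f(x)))
# rows = test subjects, columns = time points in t.eval
surv <- exp(-outer(exp(f.test), H0))
```

Each row of surv is then one test subject's estimated survival curve over t.eval, so surv[i, j] can be read as the estimated probability that subject i is still event-free at time t.eval[j].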

I think this tutorial on gbm-based survival analysis will help you!

https://github.com/liupei101/Tutorial-Machine-Learning-Based-Survival-Analysis/blob/master/Tutorial_Survival_GBM.ipynb


 