Finding the mean of the log-normal distribution in survival analysis in R

I am a novice with R. I am fitting a log-normal distribution to some survival data, but I have become stuck trying to calculate statistics such as the median and the mean. This is the code I have used so far; can anyone tell me what I should type next to find the mean?

# rm(list=ls(all=TRUE))
library(survival)
data <- read.table("M:\\w2k\\Diss\\Hoyle And Henley True IPD with number at risk known.txt", header=TRUE)
attach(data)
data
# Expand the grouped intervals into one entry per censored observation and per event
times_start <- c(rep(start_time_censor, n_censors), rep(start_time_event, n_events))
times_end <- c(rep(end_time_censor, n_censors), rep(end_time_event, n_events))
# Intercept-only log-normal fit to the interval-censored times
model <- survreg(Surv(times_start, times_end, type="interval2") ~ 1, dist="lognormal")
intercept <- summary(model)$table[1]   # "(Intercept)" row, Value column
log_scale <- summary(model)$table[2]   # "Log(scale)" row, Value column

This is where I got stuck. I have tried:

meantime<-exp(intercept+log_scale/2)

but this does not seem to give a realistic mean.

The place to look for a worked example is ?predict.survreg. (In general, reading the help pages for predict methods is a productive strategy for any regression method.)

Running the last example should give you enough basis to proceed. In particular, you should see that the regression coefficients are not themselves estimates of survival times or quantiles; they are on the log(time) scale.

lfit <- survreg(Surv(time, status) ~ ph.ecog, data=lung)
pct <- 1:98/100   # The 100th percentile of predicted survival is at +infinity
ptime <- predict(lfit, newdata=data.frame(ph.ecog=2), type='quantile',
                 p=pct, se=TRUE)
matplot(cbind(ptime$fit, ptime$fit + 2*ptime$se.fit,
                          ptime$fit - 2*ptime$se.fit)/30.5, 1-pct,
         xlab="Months", ylab="Survival", type='l', lty=c(1,2,2), col=1)
 # The plot should be examined since you asked for a median survival time
 abline(h= 0.5)
 # You can  drop a vertical from the intersection to get that graphically 

... or ...

 str(ptime)
List of 2
 $ fit   : num [1:98] 9.77 16.35 22.13 27.46 32.49 ...
 $ se.fit: num [1:98] 2.39 3.53 4.42 5.16 5.82 ...

You can extract the 50th percentile from that sequence of survival times with:

 ptime$fit[which((1-pct)==0.5)]
# [1] 221.6023   

The result is measured in days, which is why Therneau divided by 30.5 to display months.
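For the log-normal case in the question, the same idea can be sketched as follows. This is a minimal illustration, not the asker's actual analysis: the lung data stands in for the asker's file, and the names fit, mu, sigma, median_time, mean_time and median_pred are made up for the example. It relies on the standard log-normal results that the median is exp(mu) and the mean is exp(mu + sigma^2/2), where mu is the intercept and sigma is the scale itself, not its logarithm.

```r
library(survival)

# Intercept-only log-normal fit; the lung data stands in for the asker's file
fit <- survreg(Surv(time, status) ~ 1, data = lung, dist = "lognormal")

mu    <- unname(coef(fit)[1])  # location: mean of log(time)
sigma <- fit$scale             # scale: SD of log(time), i.e. exp(Log(scale))

median_time <- exp(mu)                # log-normal median
mean_time   <- exp(mu + sigma^2 / 2)  # log-normal mean

# predict() returns the same median; with no covariates every row is identical
median_pred <- unname(predict(fit, type = "quantile", p = 0.5)[1])

c(median = median_time, mean = mean_time, median_pred = median_pred)
```

In the question's own variables this would correspond to exp(intercept + exp(log_scale)^2/2), because the second entry of summary(model)$table is Log(scale) and has to be exponentiated and squared before it enters the mean.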
