
R survival survreg not producing a good fit

I am new to R, and I am trying to use survival analysis to look for a correlation in censored data. The x data are the envelope masses of protostars. The y data are the intensities of an observed molecular line, and some values are upper limits. The data are:

x <- c(17.299, 4.309, 7.368, 29.382, 1.407, 3.404, 0.450, 0.815, 1.027, 0.549, 0.018)
y <- c(2.37, 0.91, 1.70, 1.97, 0.60, 1.45, 0.25, 0.16, 0.36, 0.88, 0.42)
censor <- c(0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1)

I am using the function survreg from the R survival library:

library(survival)
modeldata<-survreg(formula=Surv(y,censor)~x, dist="exponential", control = list(maxiter=90))

Which gives the following result:

summary(modeldata)

Call:
survreg(formula = Surv(y, censor) ~ x, dist = "exponential", 
    control = list(maxiter = 90))
             Value Std. Error     z     p
(Intercept) -0.114      0.568 -0.20 0.841
x            0.153      0.110  1.39 0.163

Scale fixed at 1 

Exponential distribution
Loglik(model)= -6.9   Loglik(intercept only)= -9
Chisq= 4.21 on 1 degrees of freedom, p= 0.04 
Number of Newton-Raphson Iterations: 5 
n= 11

However, when I plot the data and the model using the following method:

plot(x,y,pch=(censor+1))
xnew<-seq(0,30)
model<-predict(modeldata,list(x=xnew))
lines(xnew,model,col="red")

I get this plot of the x and y data; triangles are censored data.

I am not sure where I am going wrong. I have tried different distributions, but all produce similar results. The same is true when I use other data, for example:

x <- c(1.14, 1.14, 1.19, 0.78, 0.43, 0.24, 0.19, 0.16, 0.17, 0.66, 0.40)

I am also not sure if I am interpreting the results correctly.

I have tried other examples using the same method (e.g. https://stats.idre.ucla.edu/r/examples/asa/r-applied-survival-analysis-ch-1/ ), and it works well, as far as I can tell.

So my questions are:

  1. Am I using the correct function for fitting the data? If not, which would be more suitable?

  2. If it is the correct function, why is the model not fitting the data even closely? Does it have to do with the plotting?

Thank you for your help.
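On question 1, one thing that may be worth checking is how the censoring is encoded. Surv(y, censor) treats status 1 as an observed event and 0 as right-censored, whereas the plot description in the question suggests that censor = 1 marks the censored points, and upper limits are left-censored rather than right-censored. The following is only a sketch under the assumption that censor == 1 flags an upper limit. The detected column and the fit_left name are introduced here for illustration, and the log(x) term and the lognormal distribution are choices, not requirements.

```r
library(survival)

dfrm <- data.frame(
  x = c(17.299, 4.309, 7.368, 29.382, 1.407, 3.404, 0.450, 0.815, 1.027, 0.549, 0.018),
  y = c(2.37, 0.91, 1.70, 1.97, 0.60, 1.45, 0.25, 0.16, 0.36, 0.88, 0.42),
  censor = c(0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1)
)

# Assumption: censor == 1 means "y is an upper limit" (left-censored).
# With type = "left", the status argument is TRUE for a detected value
# and FALSE for a left-censored one.
dfrm$detected <- dfrm$censor == 0

fit_left <- survreg(Surv(y, detected, type = "left") ~ log(x),
                    data = dfrm, dist = "lognormal")
summary(fit_left)
```

If the flag has the opposite meaning, censor == 1 would be the status indicator instead.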

The "shape" of the relationship looks concave downward, so I would have guessed a ~ log(x) fit might be more appropriate:

dfrm <- data.frame( x = c(17.299, 4.309, 7.368, 29.382, 1.407, 3.404, 0.450, 0.815, 1.027, 0.549, 0.018),
y = c(2.37, 0.91, 1.70, 1.97, 0.60, 1.45, 0.25, 0.16, 0.36, 0.88, 0.42),
censor= c(0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1))

modeldata<-survreg(formula=Surv(y,censor)~log(x), data=dfrm, dist="loggaussian", control = list(maxiter=90))

Your code seemed appropriate:

png(); plot(y~x,pch=(censor+1),data=dfrm)
xnew<-seq(0,30)
model<-predict(modeldata,list(x=xnew))
lines(xnew,model,col="red"); dev.off()

[Image: plot of y against x with the fitted curve in red]
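As a rough check on why the exponential fit in the question looks so far off, the curve that predict() draws can be reproduced by hand. By default predict() on a survreg fit returns values on the response scale, which for dist="exponential" is exp(linear predictor). The sketch below uses the coefficients copied from the summary output in the question; the names b0, b1, and pred_exp are only illustrative.

```r
# Coefficients copied from the summary(modeldata) output in the question
b0 <- -0.114
b1 <-  0.153

xnew <- seq(0, 30)

# Response-scale prediction for the exponential model: exp(intercept + slope * x)
pred_exp <- exp(b0 + b1 * xnew)

# Predicted values at x = 0, 10 and 30
round(pred_exp[xnew %in% c(0, 10, 30)], 2)
```

The prediction grows exponentially in x, passing about 4 near x = 10 and approaching 90 at x = 30, while the observed y values stay below 2.4. A model that is linear in x on the log scale cannot follow data that flatten out, which is consistent with trying log(x) as the predictor.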

modeldata
Call:
survreg(formula = Surv(y, censor) ~ log(x), data = dfrm, dist = "loggaussian", 
    control = list(maxiter = 90))

Coefficients:
(Intercept)      log(x) 
 0.02092589  0.32536509 

Scale= 0.7861798 

Loglik(model)= -6.6   Loglik(intercept only)= -8.8
    Chisq= 4.31 on 1 degrees of freedom, p= 0.038 
n= 11 
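On interpreting the output: with dist="loggaussian" and log(x) as the predictor, the linear predictor is on the log(y) scale, so exponentiating it should give the fitted median of y as a power law in x. The sketch below works that out from the printed coefficients and also recomputes the p-value from the reported chi-square. The function name median_y is made up for illustration.

```r
# Coefficients copied from the printed model above
b0 <- 0.02092589
b1 <- 0.32536509

# exp(b0 + b1 * log(x)) is the same as exp(b0) * x^b1,
# i.e. a power law with exponent b1
median_y <- function(x) exp(b0) * x^b1
median_y(c(1, 10, 30))

# p-value for the likelihood-ratio chi-square (4.31 on 1 degree of freedom)
pchisq(4.31, df = 1, lower.tail = FALSE)
```

The chi-square compares the model with log(x) against the intercept-only model, so a p-value near 0.04 is modest evidence of a relationship from only 11 points.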
