
Using newdata when plotting the survival curve of a Cox regression in R

I am trying to plot adjusted survival curves from a Cox regression that includes variable interactions.

Reading the survfit.coxph page (https://stat.ethz.ch/R-manual/R-devel/library/survival/html/survfit.coxph.html), I see the parameter "newdata":

newdata 
a data frame with the same variable names as those that appear in the coxph formula. It is also valid to use a vector, if the data frame would consist of a single row.

The curve(s) produced will be representative of a cohort whose covariates correspond to the values in newdata. Default is the mean of the covariates used in the coxph fit.

I want to plot lines for the terms in my Cox output that are interactions. For example, if my Cox output looks like:

                    coef exp(coef) se(coef)      z       p
 Drug2           -0.1345     0.876   0.1812 -0.732 4.5e-01
 Drug3           -0.3678     0.719   0.0816 -3.966 7.2e-05
 Drug4            0.0468     1.063   0.0432  0.932 3.4e-01
 Sex              0.2574     1.294   0.0786  3.133 1.2e-03
 Sex:Drug2       -0.1283     0.880   0.1809 -0.709 4.8e-01
 Sex:Drug3       -0.3226     0.724   0.0817 -3.950 7.8e-05
 Sex:Drug4        0.0524     1.054   0.0574  0.913 3.6e-01

I want to plot the new survival curves for my Drug variable after interaction with Sex.

Which leads me to this newdata parameter.

What is the difference between omitting newdata (and so using the mean of the covariates) and supplying newdata? At this point I don't even know how to build newdata correctly.

Can anyone give me pointers on how to build newdata based on my Cox model, and on what its significance is compared to just using the mean? I expect the new survival plot based on the Cox model to have the same number of lines as my original survival curve.

Without newdata you still get the adjusted mean survival as the implicit "baseline survival curve": a single curve evaluated at the covariate means. With newdata you get one curve per row, each offset from that reference by the row's hazard ratio, that is, by exp() of the corresponding combination of coefficients. You put in the covariate values for which you want estimates, and the expand.grid function will create all the two-way combinations of those values. The column names in newdata must match the variable names in the coxph formula exactly, including case. It's not clear how you have Sex modeled, but from the output it appears to be numeric rather than a factor, so I will assume a one-unit difference between the two groups and Drug levels labeled 1 to 4. Try:

plot(survfit(my.fit, newdata = expand.grid(Sex = c(1, 2), Drug = factor(1:4))))
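
Since the question does not include data, here is a minimal self-contained sketch of the same idea on simulated data. Everything in it is an assumption made for illustration: the data frame dat, the effect sizes, the 1/2 coding of Sex, and the object names fit, nd and sf are hypothetical and do not come from the original model.

```r
library(survival)

# Simulated data, purely illustrative: Drug is a 4-level factor and
# Sex is numeric (1/2), mirroring the coefficient table in the question.
set.seed(42)
n <- 400
dat <- data.frame(
  Drug = factor(sample(1:4, n, replace = TRUE)),
  Sex  = sample(1:2, n, replace = TRUE)
)
lp         <- 0.25 * dat$Sex - 0.35 * (dat$Drug == "3")  # made-up effects
event_time <- rexp(n, rate = 0.10 * exp(lp))
cens_time  <- rexp(n, rate = 0.05)
dat$time   <- pmin(event_time, cens_time)
dat$status <- as.integer(event_time <= cens_time)

# Cox model with main effects and the Drug-by-Sex interaction
fit <- coxph(Surv(time, status) ~ Drug * Sex, data = dat)

# One row per Sex/Drug combination; names and factor levels match the model
nd <- expand.grid(Sex = c(1, 2), Drug = factor(1:4))

# One predicted survival curve per row of nd (8 in total)
sf <- survfit(fit, newdata = nd)

# Colour identifies the drug, line type identifies the sex
plot(sf, col = as.integer(nd$Drug), lty = nd$Sex,
     xlab = "Time", ylab = "Survival probability")
legend("topright", legend = paste0("Drug ", nd$Drug, ", Sex ", nd$Sex),
       col = as.integer(nd$Drug), lty = nd$Sex, bty = "n", cex = 0.8)
```

The number of lines in the plot equals the number of rows in nd, so to get one line per drug for a single sex you would pass only the four rows with that Sex value. Calling survfit(fit) with no newdata returns a single curve instead.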

