简体   繁体   English

R与Stata中的Cox比例风险模型

[英]Cox proportional hazard model in R vs Stata

I´m trying to replicate in R a cox proportional hazard model estimation from Stata using the following data http://iojournal.org/wp-content/uploads/2015/05/FortnaReplicationData.dta 我试图使用以下数据在R中复制来自Stata的cox比例风险模型估计http://iojournal.org/wp-content/uploads/2015/05/FortnaReplicationData.dta

The command in stata is the following: stata中的命令如下:

stset enddate2009, id(VPFid) fail(warends) origin(time startdate)
stcox HCTrebels o_rebstrength demdum independenceC transformC lnpop lngdppc africa diffreligion warage if keepobs==1, cluster(js_country)

Cox regression -- Breslow method for ties

No. of subjects      =          104                Number of obs   =       566
No. of failures      =           86
Time at risk         =       194190
                                               Wald chi2(10)   =     56.29
Log pseudolikelihood =   -261.94776                Prob > chi2     =    0.0000

                           (Std. Err. adjusted for 49 clusters in js_countryid)
-------------------------------------------------------------------------------
              |               Robust
           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
    HCTrebels |   .4089758   .1299916    -2.81   0.005     .2193542    .7625165
o_rebstrength |   1.157554   .2267867     0.75   0.455     .7884508    1.699447
       demdum |   .5893352   .2353317    -1.32   0.185     .2694405    1.289027
independenceC |   .5348951   .1882826    -1.78   0.075      .268316    1.066328
   transformC |   .5277051   .1509665    -2.23   0.025     .3012164    .9244938
        lnpop |   .9374204   .0902072    -0.67   0.502     .7762899    1.131996
      lngdppc |   .9158258   .1727694    -0.47   0.641     .6327538    1.325534
       africa |   .5707749   .1671118    -1.92   0.055     .3215508    1.013165
 diffreligion |   1.537959   .4472004     1.48   0.139      .869834    2.719275
       warage |   .9632408   .0290124    -1.24   0.214     .9080233    1.021816
-------------------------------------------------------------------------------

With R, I´m using the following: 使用R,我使用以下内容:

data <- read.dta("FortnaReplicationData.dta")
data4 <- subset(data, keepobs==1)
data4$end_date <- data4$`_t`
data4$start_date <- data4$`_t0`
levels(data4$o_rebstrength) <- c(0:4)
data4$o_rebstrength <- as.numeric(levels(data4$o_rebstrength[data4$o_rebstrength])
data4 <- data4[,c("start_date", "end_date","HCTrebels",  "o_rebstrength", "demdum", "independenceC", "transformC", "lnpop", "lngdppc", "africa", "diffreligion", "warage", "js_countryid", "warends")]
data4 <- na.omit(data4)
surv <- coxph(Surv(start_date, end_date, warends) ~ HCTrebels+ o_rebstrength +demdum + independenceC+ transformC+ lnpop+ lngdppc+ africa +diffreligion+ warage+cluster(js_countryid), data = data4, robust = TRUE, method="breslow")

                 coef exp(coef) se(coef) robust se     z      p
HCTrebels     -0.8941    0.4090   0.3694    0.3146 -2.84 0.0045
o_rebstrength  0.1463    1.1576   0.2214    0.1939  0.75 0.4505
demdum        -0.5288    0.5893   0.4123    0.3952 -1.34 0.1809
independenceC -0.6257    0.5349   0.3328    0.3484 -1.80 0.0725
transformC    -0.6392    0.5277   0.3384    0.2831 -2.26 0.0240
lnpop         -0.0646    0.9374   0.1185    0.0952 -0.68 0.4974
lngdppc       -0.0879    0.9158   0.2060    0.1867 -0.47 0.6377
africa        -0.5608    0.5708   0.3024    0.2898 -1.94 0.0530
diffreligion   0.4305    1.5380   0.3345    0.2878  1.50 0.1347
warage        -0.0375    0.9632   0.0405    0.0298 -1.26 0.2090

Likelihood ratio test=30.1  on 10 df, p=0.000827
n= 566, number of events= 86 

I get the same hazard ratio coefficients but the standard errors does not look the same. 我得到相同的风险比系数,但标准误差看起来不一样。 The Z and p values are close but not exactly the same. Z和p值接近但不完全相同。 Why might be the difference between the results in R and Stata? 为什么R和Stata的结果可能存在差异?

As user20650 noticed, when including "nohr" in the Stata options you get exactly the same standard errors as in R. Still there was a small difference in the standard errors when using clusters. 正如user20650注意到的那样,当在Stata选项中包含“nohr”时,你会得到与R中完全相同的标准错误。使用集群时,标准错误仍然存​​在细微差别。 user20650 again noticed that the difference was given because Stata default standard errors are multiplied g/(g − 1), where g is the number of cluster while R does not adjust these standard errors. user20650再次注意到给出了差异,因为Stata默认标准误差乘以g /(g-1),其中g是簇的数量,而R不调整这些标准误差。 So a solution is just to include noadjust in Stata or have the standard errors adjusted in R by doing: 因此,解决方案只是在Stata中包含noadjust或通过执行以下操作在R中调整标准错误:

sqrt(diag(vcov(surv))* (49/48))

If still we want in R to have the same standard errors from Stata, as when not specifying nohr, we need to know that when nhr is left off we obtain $exp(\\beta)$ with the standard errors resulting from fitting the model in those scale. 如果我们仍然希望R中的Stata具有相同的标准误差,那么当没有指定nohr时,我们需要知道当nhr不使用时,我们获得$ exp(\\ beta)$,其中包含因拟合模型而产生的标准误差那些规模。 In particular obtained by applying the delta method to the original standard-error estimate. 特别是通过将delta方法应用于原始标准误差估计而获得。 "The delta method obtains the standard error of a transformed variable by calculating the variance of the corresponding first-order Taylor expansion, which for the transform $exp(\\beta)$ amounts to mutiplying the oringal standard error by $exp(\\hat{\\beta})$. This trick of calculation yields identical rsults as does transforming the parameters prior to estimation and then reestimating" (Cleves et al 2010). “delta方法通过计算相应的一阶泰勒展开的方差来获得变换变量的标准误差,对于变换$ exp(\\ beta)$等于将oringal标准误差多达$ exp(\\ hat { \\ beta})$。这个计算技巧产生的相同结果与在估计之前转换参数然后重新估计“(Cleves et al 2010)相同。 In R we can do it by using: 在R中我们可以通过使用:

library(msm)
se <-diag(vcov(surv)* (49/48))
sapply(se, function(x) deltamethod(~ exp(x1), coef(surv)[which(se==x)], x))

     HCTrebels o_rebstrength    demdum independenceC transformC     lnpop   lngdppc    africa diffreligion     warage
     0.1299916     0.2267867 0.2353317     0.1882826  0.1509665 0.0902072 0.1727694 0.1671118    0.4472004 0.02901243

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM