
Getting unadjusted values from model fit in R

I want to have a table with adjusted and unadjusted (crude) values. It seems like a common problem, but I can't figure out how to do it without re-running the regression for each variable.

Here's how I get the adjusted values:

library("survival")
library("timereg")
data(sTRACE)

# Basic cox regression    
surv <- with(sTRACE, Surv(time/365,status==9))
fit1 <- coxph(surv~age+sex+diabetes+chf+vf, data=sTRACE)
x <- cbind(exp(coef(fit1)), exp(confint(fit1)))
test <- apply(x, 1, FUN=function(x){
    x <- round(x, 1)
    txt <- paste(x[1], " (95% CI ", min(x[2:3]), "-", max(x[2:3]), ")", sep="")
    return(txt)
})
test

The test variable is now a named character vector:

> test
                   age                    sex               diabetes 
"1.1 (95% CI 1.1-1.1)" "1.4 (95% CI 1.1-1.9)"   "1.5 (95% CI 1-2.2)" 
                   chf                     vf 
"2.1 (95% CI 1.6-2.8)" "2.3 (95% CI 1.4-3.8)" 

I would like to add this to a two-dimensional matrix in which one column holds the unadjusted value, something like this:

    Adjusted               Unadjusted
age "1.1 (95% CI 1.1-1.1)" "1.1 (95% CI 1.1-1.1)"

Where the unadjusted value is created like this:

fit2 <- coxph(surv~age, data=sTRACE)
x <- cbind(exp(coef(fit2)), exp(confint(fit2)))
test <- apply(x, 1, FUN=function(x){
    x <- round(x, 1)
    txt <- paste(x[1], " (95% CI ", min(x[2:3]), "-", max(x[2:3]), ")", sep="")
    return(txt)
})
test

This could possibly be done with the update() function, but I imagine there should be some automated process, since this is common practice.
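A minimal sketch of the update() idea mentioned above: refit the full model once per term, keeping only that term on the right-hand side. It assumes the `lung` data set bundled with survival as a stand-in for sTRACE, and the names `fit_full`, `crude_fits` and `crude_hr` are made up for illustration.

```r
library(survival)

# Assumption: `lung` (shipped with survival) stands in for sTRACE
fit_full <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# One crude model per term: keep the response, swap in a single predictor
term_names <- attr(terms(fit_full), "term.labels")
crude_fits <- lapply(term_names, function(v) {
  update(fit_full, as.formula(paste(". ~", v)))
})
names(crude_fits) <- term_names

# Crude hazard ratios, one per variable
crude_hr <- vapply(crude_fits, function(f) unname(exp(coef(f))), numeric(1))
crude_hr
```

This still runs one regression per variable; update() only saves retyping the call. Note that when some predictors have missing values, each crude model may be fitted on more rows than the full model.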


UPDATE

After some thinking, and with some inspiration from the answer, I wrote this function:

print_adjusted_and_unadjusted <- function(fit, digits=2){
    # Just a prettifier for the output an alternative could be:
    # paste(round(x[,1],1), " (95% CI ", min(round(x[,2:3])), "-", max(round(x[,2:3])), ")", sep="")  
    get_coef_and_ci <- function(fit){
        # Pad with trailing zeros so that a value such as 1
        # prints as 1.0 when digits = 1. A value equal to 1 is
        # first re-rounded with one extra decimal, since
        # everyone is so hyped about p-val < 0.05
        add_zero_to_var <- function(x){
            ret <- round(as.double(x), digits)
            if (x == 1){
                ret <- round(x, digits+1)
                if (ret == 1){
                    ret <- paste("1.", paste(rep("0", digits), collapse=""), sep="")
                }
            }else if(nchar(as.character(x)) < digits + 2){
                add_zeros <- digits + 2 - nchar(as.character(x))
                ret <- paste(x, paste(rep("0", add_zeros), collapse=""), sep="")
            }
            return(ret)
        }

        # Get coefficients and conf. interval
        my_coefficients <- coef(fit)
        ci <- confint(fit)

        # Use the exp() if logit or cox regression
        if ("coxph" %in% class(fit) ||
            ("glm" %in% class(fit) &&
            fit$family$link == "logit")){
            my_coefficients <- exp(my_coefficients)
            ci <- exp(ci)
        }

        if (length(my_coefficients) > 1){
            my_coefficients <- tapply(my_coefficients, 1:length(my_coefficients), FUN = add_zero_to_var)
        }else{
            my_coefficients <- add_zero_to_var(my_coefficients)
        }

        ci <- apply(ci, 1, FUN=function(x){
                ci <- round(x, digits)
                for(i in 1:2){
                    ci[i] <- add_zero_to_var(ci[i])
                }
                return(paste(ci[1], "-", ci[2], sep=""))
            })
        ret_val <- cbind(my_coefficients, ci)
        colnames(ret_val) <- c("", "2.5% - 97.5%")
        rownames(ret_val) <- names(coef(fit))
        return(ret_val)
    }

    # Extract all the term names
    all.terms <- terms(fit) 
    var_names <- attr(all.terms, 'term.labels')

    # Skip variables consisting of
    # functions such as spline, strata variables
    regex_for_unwanted_vars <- "^(strat[a]{0,1}|ns|rcs|bs|pspline)[(]"
    skip_variables <- grep(regex_for_unwanted_vars, var_names)

    # Get the adjusted variables
    adjusted <- get_coef_and_ci(fit)
    # When using splines, rcs in cox regression this shows a little different

    # Remove all the splines, rcs etc
    # (logical indexing: a negative index built from an empty grep()
    # result would drop every row)
    rn <- rownames(adjusted)
    remove_1 <- grepl("(\'{1,}|[[][0-9]+[]]|[)][0-9]+)$", rn)
    remove_2 <- grepl("^(strat[a]{0,1}|ns|rcs|bs)[(]", rn)
    adjusted <- adjusted[!(remove_1 | remove_2), , drop=FALSE]
    if ("cph" %in% class(fit)){
        remove_3 <- grep("^rcs[(]", var_names)
        if (length(remove_3) > 0){
            adjusted <- adjusted[-remove_3, , drop=FALSE]
        }
    }

    unadjusted <- c() 
    # Index by exclusion that also works when skip_variables is empty
    for(variable in var_names[!seq_along(var_names) %in% skip_variables]){
        interaction_variable <- length(grep(":", variable)) > 0

        # If it's an interaction variable the
        # interacting parts have to be included  
        if (interaction_variable){
            variable <- paste(paste(unlist(strsplit(variable, ":")), sep="", collapse=" + "), variable, sep=" + ")
        }

        # Run the same fit but with only one variable
        fit_only1 <- update(fit, paste(".~", variable))

        # Get the coefficients processed with some advanced 
        # round part()
        new_vars <- get_coef_and_ci(fit_only1)

        # If interaction then we should only keep the 
        # interaction part - the other variables are
        # always included by default and need therefore
        # to be removed
        if (interaction_variable){
            new_vars <- new_vars[grep("[*:]", rownames(new_vars)), , drop=FALSE]
        }

        # Add them to the previous
        unadjusted <- rbind(unadjusted, new_vars)
    }

    # If regression contains (Intercept) 
    # this is meaningless for the comparison
    # of adjusted and unadjusted values 
    if ("(Intercept)" %in% rownames(unadjusted)){
        unadjusted <- unadjusted[rownames(unadjusted) != "(Intercept)", , drop=FALSE]
        unadjusted <- rbind(c("-", "-"), unadjusted)
        rownames(unadjusted)[1] <- "(Intercept)"
    }

    both <- cbind(unadjusted, adjusted)
    colnames(both) <- c("Unadjusted", "95% CI", "Adjusted", "95% CI")
    return(both)
}

It gives me a character matrix with four columns:

    Unadjusted 95% CI      Adjusted 95% CI     
age "0.74"     "0.68-0.81" "0.69"   "0.62-0.76"
....
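The trailing-zero padding that add_zero_to_var does by hand can also be obtained from base R's sprintf(). The following is only a sketch of that alternative, not the function used above; `format_fixed` is a hypothetical helper name.

```r
# Hypothetical helper: fixed number of decimals, trailing zeros kept
format_fixed <- function(x, digits = 2) {
  sprintf(paste0("%.", digits, "f"), x)
}

est <- c(1, 0.7, 2.34)
format_fixed(est, digits = 2)

# The same helper can build a confidence interval string
lower <- c(0.9, 0.5)
upper <- c(1.1, 0.95)
paste0(format_fixed(lower), "-", format_fixed(upper))
```

Because sprintf() is vectorized, no loop or tapply() call is needed to format a whole column of estimates.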

I use this together with xtable (or latex() in Hmisc):

xtable(print_adjusted_and_unadjusted(fit.oa.base.model), align="lrcrc")

I've tested it on lm(), cph() and coxph() and it seems to work.

Thanks for your help; I hope this code is useful to more people than just me.

First of all, given the x you get from your cbind, you don't need apply but can simply use vectorized code:

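# pmin()/pmax() work row by row; min()/max() would collapse the whole matrix to one number
test <- paste(round(x[,1], 1), " (95% CI ", round(pmin(x[,2], x[,3]), 1), "-", round(pmax(x[,2], x[,3]), 1), ")", sep="")
# paste() drops names, so restore them from the coefficient matrix
names(test) <- rownames(x)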

This should yield the same result.
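One detail worth checking when replacing apply with vectorized code is the difference between min() and pmin(): min() reduces its whole input to a single number, whereas pmin() compares element by element and so keeps one value per row. A small illustration on a made-up matrix (the numbers are arbitrary, not model output):

```r
# Arbitrary example values: estimate, lower bound, upper bound
x <- rbind(a = c(1.5, 1.2, 1.9),
           b = c(2.1, 1.6, 2.8))

overall_min <- min(x[, 2:3])          # one number for the whole matrix
rowwise_min <- pmin(x[, 2], x[, 3])   # one number per row

overall_min
rowwise_min
```

Here `overall_min` is 1.2, while `rowwise_min` holds 1.2 for row a and 1.6 for row b; pmax() behaves analogously for the upper bound.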

Now, if you want to run over several different variables, you will have to build your formula as a character string (note: I'm assuming test is the result of the full model here, as per your code, so I can use its names):

unadjusted <- sapply(names(test), function(curname){
  # Build and fit the single-variable model, e.g. surv~age
  curfrm <- paste("surv", curname, sep="~")
  curfit <- coxph(as.formula(curfrm), data=sTRACE)
  # Use the single-variable fit here, not the full model fit1
  curx <- cbind(exp(coef(curfit)), exp(confint(curfit)))
  paste(round(curx[,1], 1), " (95% CI ", round(min(curx[,2:3]), 1), "-", round(max(curx[,2:3]), 1), ")", sep="")
})

Now you can just cbind test and unadjusted for the desired effect.
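A sketch of that final step, using placeholder strings instead of real model output so that it runs on its own. Indexing the second vector by the names of the first is an extra precaution (an assumption, not part of the code above) that keeps the rows aligned even if the two vectors are ordered differently.

```r
# Placeholder strings standing in for the formatted model results
adjusted   <- c(age = "adj-age", sex = "adj-sex", chf = "adj-chf")
unadjusted <- c(sex = "crude-sex", chf = "crude-chf", age = "crude-age")

# Align by name, then bind into a two-column character matrix
both <- cbind(Adjusted   = adjusted,
              Unadjusted = unadjusted[names(adjusted)])
both
```

The result is a character matrix with one row per variable, which can be passed on to xtable() or a similar table printer.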
