Is it possible to relevel a variable within a multiply imputed dataset (created using mice)?

I have performed multiple imputation using the R mice package (m = 50, method = "pmm") on a dataset of randomised controlled trial (RCT) data. I have imputed data for a continuous variable (wellbeing1yr) that I am using as the dependent variable in a linear regression model. Prior to imputation, the predictor in that model (intervention) was a factor with two levels representing the two trial arms of the RCT. However, the mice package would not let me include factor variables in the imputation, so I converted it to character using the as.character function.

When running the regression on the imputed data, I would like to switch the reference level for the intervention variable, but I am unsure how to do this.

My code for the imputation is below:

# impute missing data for wellbeing1yr using the mice package
library(mice)
peermentor$intervention <- as.character(peermentor$intervention)
peermentor$injecting_status <- as.character(peermentor$injecting_status)
imputed_Data <- mice(peermentor, m = 50, method = "pmm")
# fit the regression in each imputed data set, then pool the estimates
imputefit <- with(data = imputed_Data, expr = lm(wellbeing1yr ~ intervention))
combine <- pool(imputefit)
summary(combine)

The two levels for the "intervention" variable are "peer mentoring" and "standard of care". I would like "standard of care" as the reference level.

First, you do not seem to have specified a predictorMatrix, which means that missing values in any variable in your data.frame will be predicted from all the other variables. I would advise against that unless you have clear reasons for it.
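As a rough illustration of what restricting the predictors looks like, here is a minimal sketch. It uses the nhanes data bundled with mice rather than your data, and the choice to drop hyp as a predictor is arbitrary and only for demonstration.

```r
library(mice)

# Start from the default matrix: every variable predicts every other variable
pred <- make.predictorMatrix(nhanes)

# Rows are the variables being imputed, columns are the predictors.
# Setting a column to 0 stops that variable from being used as a predictor.
pred[, "hyp"] <- 0

# Pass the edited matrix to mice
imp <- mice(nhanes, predictorMatrix = pred, m = 5, seed = 123, printFlag = FALSE)

# Inspect the matrix that was actually used
imp$predictorMatrix
```

Which predictors to keep is a modelling decision that depends on your data, so treat the matrix above as a template rather than a recommendation.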

Redefining the levels of your intervention variable can be done after you have completed the multiple imputation. Use miceadds::mids2datlist(imputed_Data) to convert the mids object to a list of multiply imputed data sets. You can make adjustments to covariates in this list and then convert it back to a mids object with miceadds::datlist2mids.

Since you have not provided example data, I will show how this is done with a reprex based on the nhanes data that ships with mice.

library(mice)
library(miceadds)

# multiply impute missing values
imp <- mice(nhanes, print = FALSE, seed = 123, maxit = 20, m = 10)

# stack the completed data sets in long format (adds .imp and .id columns)
longlist <- complete(imp, 'long')

# split into one data frame per imputation and add a new factor covariate based on chl
longlist <- lapply(split(longlist, longlist$.imp), \(x) {
  cbind(x, chl_new = factor(ifelse(x$chl < 185, 1, ifelse(x$chl >= 185 & x$chl < 191.4, 2, 3))))
})

# convert the list of data frames back to a mids object
imp_update <- datlist2mids(longlist)

# fit the model in each imputed data set and pool the results
fit <- with(imp_update, lm(age ~ bmi + chl_new))
summary(pool(fit))
#>          term   estimate  std.error  statistic        df     p.value
#> 1 (Intercept)  4.2473138 1.24133838  3.4215600  9.298439 0.007254247
#> 2         bmi -0.1115735 0.05033307 -2.2167042  8.278105 0.056377780
#> 3    chl_new2  0.3181516 0.45055559  0.7061317 15.177053 0.490804764
#> 4    chl_new3  0.8243235 0.46964596  1.7552020 11.276793 0.106319297
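For the releveling itself, the same round trip can be sketched as follows. This is only an illustration under stated assumptions: toy is simulated stand-in data (the real peermentor data were not posted), wellbeing0 is a made-up baseline covariate, and the round trip uses mice's own complete(..., include = TRUE) and as.mids() instead of the miceadds helpers.

```r
library(mice)

# Hypothetical stand-in for the question's data; all values are simulated
set.seed(1)
n <- 100
toy <- data.frame(
  intervention = factor(sample(c("peer mentoring", "standard of care"), n, replace = TRUE)),
  wellbeing0 = rnorm(n, mean = 50, sd = 10)
)
toy$wellbeing1yr <- toy$wellbeing0 +
  3 * (toy$intervention == "peer mentoring") + rnorm(n, sd = 5)
toy$wellbeing1yr[sample(n, 20)] <- NA  # introduce missing outcome values

imp <- mice(toy, m = 5, method = "pmm", seed = 123, printFlag = FALSE)

# Stack the original data (.imp == 0) and all imputed data sets
long <- complete(imp, action = "long", include = TRUE)

# Make "standard of care" the reference level in every data set at once
long$intervention <- relevel(factor(long$intervention), ref = "standard of care")

# Convert the long data back to a mids object
imp_releveled <- as.mids(long)

# Fit and pool; the coefficient now compares "peer mentoring" to "standard of care"
fit <- with(imp_releveled, lm(wellbeing1yr ~ intervention))
pooled <- summary(pool(fit))
pooled
```

Because factor() is applied before relevel(), the same lines should work whether intervention is stored as a factor or as character in the imputed data.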

