I have performed multiple imputation using the R mice package (m = 50, method = "pmm") on a dataset of randomised controlled trial (RCT) data. I have imputed data for a continuous variable that I am using as the dependent variable in a linear regression model. The independent variable, intervention, was originally a factor with two levels representing the two trial arms of the RCT. However, the mice package would not let me include factor variables in the imputation, so I converted it to character using the as.character function.
When running the regression on the imputed data, I would like to switch the reference level for the intervention variable, but I am unsure how to do this.
My code for the imputation is below:
#impute missing data for wellbeing1yr using MICE package
peermentor$intervention <- as.character(peermentor$intervention)
peermentor$injecting_status <- as.character(peermentor$injecting_status)
imputed_Data <- mice(peermentor, m=50, method = "pmm")
imputefit <- with(data = imputed_Data, expr = lm(wellbeing1yr ~ intervention))
combine <- pool(imputefit)
summary(combine)
The two levels for the "intervention" variable are "peer mentoring" and "standard of care". I would like "standard of care" as the reference level.
First, you do not seem to have specified a predictorMatrix, which means that missing values in any variable in your data.frame will be predicted by all the other variables. I would advise against this unless you have clear reasons to do so.
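To illustrate the point about the predictor matrix, here is a minimal sketch using the nhanes data shipped with mice as a stand-in for your data. Which predictor to drop is purely an illustrative assumption; the right choice depends on your study.

```r
library(mice)

# start from the default matrix: every variable predicts every other one
pred <- make.predictorMatrix(nhanes)

# rows are the variables being imputed, columns are their predictors;
# setting a cell to 0 removes that predictor (illustrative choice only)
pred["bmi", "hyp"] <- 0

imp_pred <- mice(nhanes, predictorMatrix = pred, m = 5, seed = 123,
                 printFlag = FALSE)

# inspect the matrix that was actually used
imp_pred$predictorMatrix
```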
Redefining the levels of your intervention variable can be done after you have completed the multiple imputation. Use miceadds::mids2datlist(imputed_Data) to convert the mids object to a list of multiply imputed data sets. You can use this list to make adjustments to covariates and then convert it back to a mids object with miceadds::datlist2mids.
Since you have not provided example data yet, I will show how this is done via this reprex.
library(mice)
library(miceadds)
# multiply impute missing values
imp <- mice(nhanes, print = FALSE, seed = 123, maxit = 20, m = 10)
# stack the imputed data sets in long format (adds .imp and .id columns)
longlist <- complete(imp, 'long')
# split by imputation number and add a new factor covariate based on chl
longlist <- lapply(split(longlist, longlist$.imp), \(x) {
  cbind(x, chl_new = factor(ifelse(x$chl < 185, 1, ifelse(x$chl >= 185 & x$chl < 191.4, 2, 3))))
})
# convert the list of data sets back to a mids object
imp_update <- datlist2mids(longlist)
fit <- with(imp_update, lm(age ~ bmi + chl_new))
> summary(pool(fit))
         term   estimate  std.error  statistic        df     p.value
1 (Intercept)  4.2473138 1.24133838  3.4215600  9.298439 0.007254247
2         bmi -0.1115735 0.05033307 -2.2167042  8.278105 0.056377780
3    chl_new2  0.3181516 0.45055559  0.7061317 15.177053 0.490804764
4    chl_new3  0.8243235 0.46964596  1.7552020 11.276793 0.106319297
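Applied to the reference level itself, the same round trip might look like the sketch below. It swaps in mice's own complete() and as.mids() rather than the miceadds helpers, and it uses the built-in nhanes2 data, where the factor age stands in for your intervention variable (an assumption, since your data are not available).

```r
library(mice)

imp <- mice(nhanes2, m = 5, seed = 123, printFlag = FALSE)

# stack the original data (.imp == 0) and all imputed data sets in long format
long <- complete(imp, action = "long", include = TRUE)

# make the oldest age group the reference level (age has no missing values)
long$age <- relevel(long$age, ref = "60-99")

# convert back to a mids object and refit the model
imp_relevel <- as.mids(long)
fit_relevel <- with(imp_relevel, lm(bmi ~ age))
pooled <- summary(pool(fit_relevel))
pooled
```

For your data, the corresponding step would presumably be something like long$intervention <- relevel(factor(long$intervention), ref = "standard of care"), after which the pooled coefficient compares peer mentoring against standard of care.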