
How to run a linear regression using lm() on a subset in R after multiple imputation using MICE

I want to run a linear regression on my multiply imputed data. I imputed my dataset using mice. The code I used to run a linear regression on the whole imputed set is as follows:

 mod1 <- with(imp, lm(outc ~ age + sex))
 pool_mod1 <- pool(mod1)
 summary(pool_mod1)

This works fine. Now I want to split the analysis by BMI: I want to run this regression separately for people with a BMI below 30 and for people with a BMI of 30 or above. I tried the following:

 mod2 <-with(imp, lm(outc ~ age + sex), subset=(bmi<30))
 pool_mod2 <- pool(mod2)
 summary(pool_mod2)

 mod3 <-with(imp, lm(outc ~ age + sex), subset=(bmi>=30))
 pool_mod3 <- pool(mod3)
 summary(pool_mod3)

I do not get an error, but all three analyses give me exactly the same results. I thought this might simply reflect the data, but the same thing happens when I subset on variables other than bmi (for example, blood pressure < 150).

So my question is: how can I do a subset analysis in R when the data were imputed using mice?

(BMI is imputed as well; I do not know whether that is a problem.)

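You should place subset within lm(), not outside of it. subset is an argument of lm(); when it is passed as an extra argument to with(), it is silently ignored, which is why all three models gave identical results.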

with(imp, lm(outc ~ age + sex, subset=(bmi<30)))

A reproducible example with mtcars:

with(mtcars, lm(mpg ~ disp + hp))                      # full data
with(mtcars, lm(mpg ~ disp + hp), subset=(cyl < 6))    # subset is ignored: same result as above

Coefficients:
(Intercept)         disp           hp  
   30.73590     -0.03035     -0.02484  


with(mtcars, lm(mpg ~ disp + hp, subset=(cyl < 6))) # Calculates on the subset

Coefficients:
(Intercept)         disp           hp  
   43.04006     -0.11954     -0.04609 
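
The same pattern applied to a mids object might look like the sketch below. It is an illustration under assumptions, not the asker's analysis: it uses the nhanes example data shipped with mice in place of the original data, with chl and age standing in for outc and the predictors, and the number of imputations and the seed are arbitrary choices.

```r
library(mice)

# nhanes is a small example data set shipped with mice (age, bmi, hyp, chl);
# it stands in for the asker's data here.
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Full-sample model, analogous to mod1 in the question
fit_all <- with(imp, lm(chl ~ age))

# Subgroup model: subset is an argument of lm(), so it goes inside the lm() call
fit_sub <- with(imp, lm(chl ~ age, subset = bmi < 30))

# Pool the per-imputation fits
est_all <- summary(pool(fit_all))
est_sub <- summary(pool(fit_sub))

# bmi is itself imputed, so the subgroup size can differ between imputed data sets
n_sub <- sapply(fit_sub$analyses, nobs)

print(est_all)
print(est_sub)
print(n_sub)
```

Because bmi is imputed too, group membership is decided separately within each completed data set, so it is worth checking the per-imputation sample sizes (n_sub above) before interpreting the pooled subgroup estimates.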
