
R Imputation With MICE

set.seed(1)
library(data.table)
data <- data.table(STUDENT = 1:1000,
                   OUTCOME = sample(20:90, 1000, replace = TRUE),
                   X1 = runif(1000),
                   X2 = runif(1000),
                   X3 = runif(1000))
# Set a share of the values in each X column to missing
data[, X1 := fifelse(X1 > .9, NA_real_, X1)]
data[, X2 := fifelse(X2 > .78 & X2 < .9, NA_real_, X2)]
data[, X3 := fifelse(X3 < .1, NA_real_, X3)]

Say you have the data shown above and you wish to impute values for X1, X2, and X3 while leaving STUDENT and OUTCOME out of the imputation.

I can do:

library(mice)
dataIMPUTE=mice(data[, c("X1", "X2", "X3")], m = 1)

But how do I combine the imputed values from dataIMPUTE with STUDENT and OUTCOME? I am afraid I will merge them incorrectly, so I would appreciate any advice on this.

One possibility is to pass the complete data set to mice, but change the predictorMatrix so that STUDENT and OUTCOME are not used in the imputation model.

First, run mice with maxit = 0 to extract the default predictorMatrix without computing any imputations. Then set to 0 every column of that matrix whose variable should not be used as a predictor in the imputation model. All of your variables, including STUDENT and OUTCOME, are still contained in the resulting dataIMPUTE object:

set.seed(1)
library(data.table)
data=data.table(STUDENT = 1:1000,
                OUTCOME = sample(20:90, 1000, replace = TRUE),
                X1 = runif(1000),
                X2 = runif(1000),
                X3 = runif(1000))
index_1 <- sample(1:1000, 100)
index_2 <- sample(1:1000, 100)
index_3 <- sample(1:1000, 100)
data[index_1, X1 := NA_real_]
data[index_2, X2 := NA_real_]
data[index_3, X3 := NA_real_]

library(mice)
init <- mice(data, maxit = 0, print = FALSE)

# extract the predictor matrix
pred_mat <- init$predictorMatrix

# remove STUDENT and OUTCOME as predictors
pred_mat[, c("STUDENT", "OUTCOME")] <- 0

# do the imputation
dataIMPUTE = mice(data, predictorMatrix = pred_mat, m = 1)
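
Because STUDENT and OUTCOME stay in the mids object, no merge should be needed afterwards. A minimal sketch of pulling out the filled-in data with mice's complete() function, assuming the same simulated data as above (the name data_completed is only illustrative, and the call is written as mice::complete() in case another loaded package such as tidyr masks complete()):

```r
library(mice)
library(data.table)

# Rebuild the example data used in the answer above
set.seed(1)
data <- data.table(STUDENT = 1:1000,
                   OUTCOME = sample(20:90, 1000, replace = TRUE),
                   X1 = runif(1000),
                   X2 = runif(1000),
                   X3 = runif(1000))
data[sample(1:1000, 100), X1 := NA_real_]
data[sample(1:1000, 100), X2 := NA_real_]
data[sample(1:1000, 100), X3 := NA_real_]

# Dry run to get the default predictor matrix, then stop
# STUDENT and OUTCOME from being used as predictors
pred_mat <- mice(data, maxit = 0, printFlag = FALSE)$predictorMatrix
pred_mat[, c("STUDENT", "OUTCOME")] <- 0

# Single imputation (m = 1), as in the answer
dataIMPUTE <- mice(data, predictorMatrix = pred_mat, m = 1, printFlag = FALSE)

# Extract the first (and only) completed data set; rows keep their
# original order, so STUDENT and OUTCOME line up with the imputed X columns
data_completed <- as.data.table(mice::complete(dataIMPUTE, 1))
head(data_completed)
```

With m greater than 1, the second argument of complete() selects which of the imputed data sets to return.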
