Random Effects in Longitudinal Multilevel Imputation Models Using MICE

I am trying to impute data in a dataset with a longitudinal design. There are two predictors (experimental group and time) and one outcome variable (score). The clustering variable is id.

Here is the toy data:

set.seed(345)
A0 <- rnorm(4,2,.5)
B0 <- rnorm(4,2+3,.5)
A1 <- rnorm(4,6,.5)
B1 <- rnorm(4,6+2,.5)
A2 <- rnorm(4,10,.5)
B2 <- rnorm(4,10+1,.5)
A3 <- rnorm(4,14,.5)
B3 <- rnorm(4,14+0,.5)
score <- c(A0,B0,A1,B1,A2,B2,A3,B3)
id <- rep(1:8,times = 4, length = 32)
time <- rep(0:3, each = 8, length = 32)
group <- rep(c("A","B"), times =2, each = 4, length = 32)
# group is stored as a factor (the default before R 4.0) so it can be dummy-coded
df <- data.frame(id = id, group = factor(group), time = time,  score = score)

# plots (ggplot2 >= 3.3.0 uses `fun` in place of the deprecated `fun.y`)
library(ggplot2)
(ggplot(df, aes(x = time, y = score, group = group)) + 
    stat_summary(fun = mean, geom = "line", aes(linetype = group)) +
    stat_summary(fun = mean, geom = "point", aes(shape = group), size = 3) +
    coord_cartesian(ylim = c(0,18)))

# now place some NAs
df[sample(1:nrow(df), 10, replace = FALSE),"score"] <- NA

df

If I understand this post correctly, in the predictor matrix I should specify the id clustering variable with a -2 and the two fixed predictors, time and group, with a 1. Like so:

library(mice)

(ini <- mice(df, maxit=0))
(pred <- ini$predictorMatrix)
(pred["score",] <- c(-2, 1, 1, 0))
(imp <- mice(df, 
            method = c("", "", "", "2l.pan"),
            pred = pred, 
            maxit = 1, 
            seed = 71152))
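
As a side note, the row of the predictor matrix can also be filled in by column name rather than by position, which guards against mistakes if the column order ever changes. The sketch below builds a stand-in matrix by hand (an assumption made only so that the snippet runs without mice); with mice loaded, one would start from ini$predictorMatrix instead.

```r
# Stand-in for ini$predictorMatrix, built by hand so this runs without mice.
vars <- c("id", "group", "time", "score")
pred <- matrix(1, nrow = 4, ncol = 4, dimnames = list(vars, vars))
diag(pred) <- 0  # a variable never predicts itself

# Assign codes by column name: -2 = cluster (class) variable, 1 = fixed effect.
pred["score", c("id", "group", "time")] <- c(-2, 1, 1)

pred["score", ]
```

The resulting row carries the same codes as the positional assignment c(-2, 1, 1, 0) used above.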

What I would like to know is:

  1. Is this a longitudinal random intercepts imputation model? Specifying the id variable as -2 designates it as a 'class' variable, but this mice primer suggests that for multilevel models you should create a variable of all 1's in the data frame as a constant, which is then specified as the random intercept via a 2 in the predictor matrix. However, that is based on the 2l.norm function rather than the 2l.pan function, so I am not sure where that leaves me. Does the 2l.pan function not require this column, or the specification of random effects?
  2. Is there any way to specify a longitudinal random-slopes model, and, if so, how?

The pan library doesn't require an intercept term.

You can dig into the function using

library(pan)
?pan

That said, mice uses a wrapper around pan called mice.impute.2l.pan. With the mice library loaded, you can look at the help for that function. It has a parameter called intercept, which is described as a "Logical determining whether the intercept is automatically added." It is TRUE by default, and the intercept is defined as a random intercept by default. I found this out after browsing the R code for the mice wrapper:

if (intercept) {
  x <- cbind(1, as.matrix(x))
  type <- c(2, type)
}

Here, the type parameter is documented as a "Vector of length ncol(x) identifying random and class variables." In other words, the intercept is added by default and defined as a random effect.
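
To make the effect of that snippet concrete, here is a minimal base R sketch of the same two lines applied to a toy design matrix. The matrix values and the starting type vector are invented for illustration; only the if-block mirrors the wrapper code quoted above.

```r
# Toy design matrix: cluster id plus two predictors (values are made up).
x <- cbind(id = c(1, 1, 2, 2), group = c(0, 0, 1, 1), time = c(0, 1, 0, 1))
type <- c(-2, 1, 1)   # -2 = cluster variable, 1 = fixed effect
intercept <- TRUE     # the documented default

if (intercept) {
  x <- cbind(1, as.matrix(x))  # prepend a column of ones
  type <- c(2, type)           # code 2 marks that column as a random effect
}

colnames(x)[1] <- "(Intercept)"  # label added here only for readability
x
type
```

After the block runs, x has a leading column of ones and type starts with 2, which is why no constant column has to be supplied by hand.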

They do provide an example like the one you mentioned, with a 1 for "x" in the predictor matrix for fixed effects.

The documentation also states, for 2l.norm: "The random intercept is automatically added in mice.impute.2l.norm()."

It has a few examples with descriptions. The CRAN documentation for pan might help you.

This answer is probably a bit late for you, but it may help some people who read this in the future:

How to work with 2l.pan

Below are some details about specifying multilevel imputation models with mice. Because the application is longitudinal, I use the term "persons" to refer to units at Level 2. These are the most relevant arguments for 2l.pan, as described in the mice documentation:

type

Vector of length ncol(x) identifying random and class variables. Random effects are identified by a 2. The group variable (only one is allowed) is coded as -2. Random effects also include the fixed effect. If for a covariates X1 group means shall be calculated and included as further fixed effects choose 3. In addition to the effects in 3, specification 4 also includes random effects of X1.

There are 5 different codes you can use in the predictor matrix for variables imputed with 2l.pan. The person identifier is coded as -2 (this is different from 2l.norm). To include predictor variables with fixed or random effects, these variables are coded with 1 or 2, respectively. If coded as 2, the corresponding fixed effect is automatically included.

In addition, 2l.pan offers the codes 3 and 4, which have similar meanings to 1 and 2 but include an additional fixed effect for the person mean of that variable. This is useful if you're trying to model within- and between-person effects of time-varying predictor variables.
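
As a rough illustration of what "the person mean of that variable" refers to, the base R sketch below computes it for a hypothetical time-varying covariate called stress. The variable name and values are invented, and this is not the code mice runs internally; it only shows the kind of extra predictor that codes 3 and 4 add.

```r
# Hypothetical time-varying covariate `stress`; names and values are invented.
d <- data.frame(
  id     = rep(1:3, each = 4),
  stress = c(2, 4, 6, 8,  1, 1, 3, 3,  5, 7, 7, 9)
)

# Person mean: constant within a person, varying between persons.
d$stress_mean <- ave(d$stress, d$id)

unique(d[, c("id", "stress_mean")])
```

Because stress_mean only differs between persons, its coefficient captures the between-person effect, while stress itself carries the within-person variation once the mean is in the model.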

intercept

Logical determining whether the intercept is automatically added.

By default, 2l.pan includes the intercept as both a fixed and a random effect. For this reason, it is not necessary to include a constant term in the predictor matrix. If you set intercept=FALSE, this behavior changes and the intercept is dropped from the imputation model.

groupcenter.slope

If TRUE, in case of group means (type is 3 or 4) group mean centering for these predictors are conducted before doing imputations. Default is FALSE.

Using this option, it is possible to center predictor variables around the person mean instead of including the predictor variable "as is" (i.e., without centering). This only applies to variables coded as 3 or 4. For predictors coded as 3, this is not very important because the models with and without centering are equivalent.

However, when predictor variables are coded as 4 (i.e., with a random slope), centering alters the meaning of the random effect: the random slope no longer applies to the variable "as is" but to the within-person deviation of that variable.

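
The difference between the variable "as is" and its within-person deviation can be sketched in base R as follows. The data are invented, and the snippet only illustrates person-mean centering as a concept; it makes no claim about how mice implements groupcenter.slope internally.

```r
# Invented example: two persons, three measurements each.
d <- data.frame(
  id = rep(1:2, each = 3),
  x  = c(1, 2, 3,  10, 20, 30)
)

x_mean   <- ave(d$x, d$id)  # person mean (between-person part)
x_within <- d$x - x_mean    # within-person deviation (centered predictor)

# Each person's deviations sum to zero.
tapply(x_within, d$id, sum)
```

With centering, a random slope describes how persons differ in the effect of x_within rather than in the effect of the raw x.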
In your example, you can include a simple random slope for time as follows:

library(mice)
ini <- mice(df, maxit=0)

# predictor matrix (following 'type')
pred <- ini$predictorMatrix
pred["score",] <- c(-2, 1, 2, 0)

# imputation method
meth <- c("", "", "", "2l.pan")

imp <- mice(df, method=meth, pred=pred, maxit=10, m=10)

In this example, coding time as 3 or 4 wouldn't make much sense because the person means of time are identical for all persons. However, if you have time-varying covariates that you want to include as predictor variables in the imputation model, 3 and 4 can be useful.

Additional arguments such as intercept and groupcenter.slope can be specified directly in the call to mice(), for example:

imp <- mice(df, ..., groupcenter.slope=TRUE)

Regarding your questions

So, to answer your questions as stated in the post:

  1. Yes, 2l.pan provides a multilevel (or rather two-level) imputation model. The intercept is included as both a fixed and a random effect by default (this can be changed with intercept=FALSE) and need not be specified in the predictor matrix (in contrast to 2l.norm).

  2. Yes, you can specify random slopes with 2l.pan. To do that, predictors with random slopes are coded as 2 or 4 in the predictor matrix. If coded as 2, the random slope is included. If coded as 4, the random slope is included along with an additional fixed effect for the person means of that variable. If coded as 4, the meaning of the random slope can be altered by using groupcenter.slope=TRUE (see above).

This article also includes some worked examples of how to work with 2l.pan and other functions for multilevel imputation: [Link]
