How to write lmer formula for mixed effects model with two fixed effects

I'm new to linear mixed effects models and I'm trying to use them for hypothesis testing.

In my data (DF) I have two categorical/factor variables: color (red/blue/green) and direction (up/down). I want to see if there are significant differences in scores (numeric values) across these factors and if there is an interaction effect, while accounting for random intercepts and random slopes for each participant.

What is the appropriate lmer formula for doing this?

Here's what I have...

My data is structured like so:

> str(DF)

'data.frame':   4761 obs. of  4 variables:
 $ participant     : Factor w/ 100 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ direction       : Factor w/ 2 levels "down","up": 2 2 2 2 2 2 2 2 2 2 ...
 $ color           : Factor w/ 3 levels "red","blue",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ scores          : num  15 -4 5 25 0 3 16 0 5 0 ...

After some reading, I figured that I could write a model with random slopes and intercepts for participants and one fixed effect like so:

model_1 <- lmer(scores ~ direction + (direction|participant), data = DF) 

This gives me a fixed effect estimate and p-value for direction, which I understand to be a meaningful assessment of the effect of direction on scores while individual differences across participants are accounted for as a random effect.

But how do I add in my second fixed factor, color, and an interaction term whilst still affording each participant a random intercept and slope?

I thought maybe I could do this:

model_2 <- lmer(scores ~ direction * color + (direction|participant) + (color|participant), data = DF) 

But ultimately I really don't know what exactly this formula means. Any guidance would be appreciated.

You can include several random slopes in at least two ways:

  1. What you proposed: estimate random slopes for both predictors, but don't estimate the correlation between them (i.e. assume the random slopes of different predictors don't correlate):
    scores ~ direction * color + (direction|participant) + (color|participant)

  2. The same, but also estimate the correlation between the random slopes of different predictors:
    scores ~ direction * color + (direction + color|participant)
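To see what each form estimates, it can help to fit both and look at the variance-covariance tables. The following is only a sketch on simulated data: the values in DF below are made up (only the column names follow the question), and it assumes the lme4 package is installed. Singular-fit or convergence warnings are to be expected with toy data like this.

```r
library(lme4)  # assumed to be installed

set.seed(1)
n_part <- 30  # hypothetical number of participants (the real data has 100)

# Simulated stand-in for DF: every participant sees each direction/color cell 8 times
DF <- expand.grid(
  participant = factor(seq_len(n_part)),
  direction   = factor(c("down", "up")),
  color       = factor(c("red", "blue", "green"), levels = c("red", "blue", "green")),
  rep         = 1:8
)

# Made-up generating process: random intercept and random direction slope per participant
b0 <- rnorm(n_part, sd = 3)
b1 <- rnorm(n_part, sd = 2)
p  <- as.integer(DF$participant)
DF$scores <- 5 + b0[p] +
  (2 + b1[p]) * (DF$direction == "up") +
  1.5 * (DF$color == "blue") +
  rnorm(nrow(DF), sd = 4)

# Form 1: two separate random-effects terms for participant
m_sep <- lmer(scores ~ direction * color +
                (direction | participant) + (color | participant), data = DF)

# Form 2: one joint term, so slopes of different predictors may correlate
m_join <- lmer(scores ~ direction * color +
                 (direction + color | participant), data = DF)

print(VarCorr(m_sep))   # two blocks, each with its own intercept row
print(VarCorr(m_join))  # one block with all pairwise correlations
```

In the printed tables, form 1 shows two blocks for participant, while form 2 shows a single block whose correlation columns link the direction and color slopes.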

Please note two things:

First, in both cases the random intercepts for participant are included, as are the correlations between each random slope and the random intercept. This probably makes sense unless you have theoretical reasons to the contrary. See this useful summary if you want to avoid the correlation between random intercepts and slopes. Note that in the first form each (...|participant) term carries its own random intercept, so the model contains two random intercept terms for participant.

Second, in both cases you don't include a random slope for the interaction term! If the interaction effect is actually what you are interested in, you should at least try to fit a model with random slopes for it, to avoid potential bias in the fixed interaction effect. Here, again, you can choose to allow or avoid correlations between the interaction term's random slopes and the other random slopes:
Without correlation: scores ~ direction * color + (direction|participant) + (color|participant) + (direction:color|participant)
With correlation: scores ~ direction * color + (direction * color|participant)
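Adding the interaction to a correlated random-effects term makes the covariance matrix grow quickly, which is worth knowing before fitting. The base-R sketch below counts the design columns that a random-effects specification expands to under R's default treatment contrasts, and the variances and correlations that go with them. The helper n_cov_params is a hypothetical name introduced here for illustration, not part of lme4.

```r
# Factor levels as in the question (third color level assumed to be "green")
direction <- factor(c("down", "up"))
color     <- factor(c("red", "blue", "green"), levels = c("red", "blue", "green"))
cells     <- expand.grid(direction = direction, color = color)

# Hypothetical helper: number of columns the left-hand side of a (... | participant)
# term expands to, plus the variances and correlations of a full covariance matrix
n_cov_params <- function(rhs) {
  k <- ncol(model.matrix(rhs, data = cells))
  c(columns = k, variances = k, correlations = k * (k - 1) / 2)
}

n_cov_params(~ direction)          # 2 columns: 2 variances, 1 correlation
n_cov_params(~ direction + color)  # 4 columns: 4 variances, 6 correlations
n_cov_params(~ direction * color)  # 6 columns: 6 variances, 15 correlations
```

So (direction * color|participant) asks for 21 variance-covariance parameters per the count above, which is a lot to estimate from 100 participants; a singular fit would not be surprising.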

If you have no theoretical basis to decide between models with or without correlations between the random slopes, I suggest you fit both, compare them with anova() and choose the one that fits your data better.
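A comparison along these lines might look like the sketch below. It again uses simulated data with made-up values (only the column names follow the question) and assumes lme4 is installed. The models are fitted with REML = FALSE so that their likelihoods are directly comparable.

```r
library(lme4)  # assumed to be installed

set.seed(2)
n_part <- 30  # hypothetical number of participants

# Simulated stand-in for DF with the same column names as in the question
DF <- expand.grid(
  participant = factor(seq_len(n_part)),
  direction   = factor(c("down", "up")),
  color       = factor(c("red", "blue", "green"), levels = c("red", "blue", "green")),
  rep         = 1:8
)
b0 <- rnorm(n_part, sd = 3)  # made-up random intercepts
b1 <- rnorm(n_part, sd = 2)  # made-up random direction slopes
p  <- as.integer(DF$participant)
DF$scores <- 5 + b0[p] + (2 + b1[p]) * (DF$direction == "up") +
  rnorm(nrow(DF), sd = 4)

# Without correlation between direction and color slopes
m_sep <- lmer(scores ~ direction * color +
                (direction | participant) + (color | participant),
              data = DF, REML = FALSE)

# With correlation between direction and color slopes
m_join <- lmer(scores ~ direction * color +
                 (direction + color | participant),
               data = DF, REML = FALSE)

cmp <- anova(m_sep, m_join)  # one row per model: AIC, BIC, logLik, chi-square test
print(cmp)
```

Lower AIC/BIC values point to the better-fitting model; with toy data like this the difference may well be negligible.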
