
R: Inflated degrees of freedom in mixed linear model

I have a question regarding a mixed model I am using. In a study, participants were presented with 40 different news article headlines and indicated for each headline whether they would share it or not (Yes coded as 1, No coded as 0). There are two binary within-subjects factors, “Accuracy” (true vs. false) and “Strategy” (attacks outgroup vs. praises ingroup). In addition, there is a binary between-subjects factor, “Condition” (threat vs. neutral).

I wanted to run a mixed model with random intercepts for participants (id) and headlines (Headline), with the sharing decision as the dependent variable and Accuracy, Strategy and Condition as independent variables. I have two issues with that.

When I try to fit a multilevel logistic regression with the following command, I run into convergence issues:

library(lme4)  # provides glmer() and lmer()

mreg_P3_g <- glmer(
  Sharing_P3 ~ (1 | id) + (1 | Headline) + Strategy * Accuracy * Condition,
  data = df,
  family = "binomial"
)

Therefore, I tried to run a linear model with the following command:

mreg_P3 <- lmer(
  Sharing_P3 ~ Strategy * Accuracy * Condition + 
  (1|Headline) + (1|id),
  data=df
)

When I do that, I receive the following output:

        Type III Analysis of Variance Table with Kenward-Roger's method
                            Sum Sq Mean Sq NumDF  DenDF  F value    Pr(>F)    
Strategy                     0.828   0.828     1   35.1   7.5283  0.009505 ** 
Accuracy                    80.154  80.154     1   35.1 729.1441 < 2.2e-16 ***
Condition                    0.030   0.030     1  195.7   0.2728  0.602041    
Strategy:Accuracy            0.528   0.528     1   35.1   4.8006  0.035180 *  
Strategy:Condition           0.462   0.462     1 3723.1   4.2026  0.040431 *  
Accuracy:Condition           0.741   0.741     1 3723.1   6.7425  0.009451 ** 
Strategy:Accuracy:Condition  0.457   0.457     1 3723.1   4.1596  0.041468 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

As you can probably see, I get a lot of significant effects, and the effects Strategy:Condition, Accuracy:Condition and Strategy:Accuracy:Condition are not interpretable when looking at the visualized data. I attribute their significance to the inflated degrees of freedom and wonder whether I need to specify the random effects of my model differently.

I am far from an expert and would be very happy for any help! Thank you very much in advance!

I suspect that there is nothing wrong with the degrees-of-freedom (DF) estimates. If you have 100 participants each having 40 headlines to evaluate, you have 4000 observations. The DF used to evaluate those interaction terms should represent the number of observations minus the DF used up for other aspects of the model.
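To make that bookkeeping concrete, here is a rough sketch for the hypothetical 100 participants x 40 headlines case. It assumes a fully balanced design and treats the participant and headline intercepts as if they were fixed effects, so it is only a fixed-effects analogue and not the Kenward-Roger calculation that produced your table.

```r
# Hypothetical balanced design: every participant rates every headline
n_participants <- 100
n_headlines    <- 40
n_obs          <- n_participants * n_headlines   # 4000 observations

# DF "used up" before the cross-level interactions can be tested
df_intercept    <- 1
df_participants <- n_participants - 1  # also absorbs the Condition main effect
df_headlines    <- n_headlines - 1     # also absorbs Strategy, Accuracy, Strategy:Accuracy
df_cross_level  <- 3                   # Strategy:Condition, Accuracy:Condition,
                                       # Strategy:Accuracy:Condition

resid_df <- n_obs - df_intercept - df_participants - df_headlines - df_cross_level
resid_df
```

The point is only the order of magnitude: terms that vary within both participants and headlines are tested against a residual with thousands of DF, whereas Condition is limited by the number of participants and Strategy and Accuracy by the number of headlines. That is the same pattern as the three DenDF levels in your table.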

What seems more likely in your lmer() model is that the interaction terms are "statistically significant" without being practically significant, given the magnitude of the Accuracy effect. Practical and statistical significance aren't the same thing, particularly with large sample sizes.
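One quick way to put numbers on that is to convert each F statistic in your table into an approximate partial eta-squared, using the usual conversion F * NumDF / (F * NumDF + DenDF). For a mixed model this is only a rough effect-size index (my assumption that it is good enough for a sanity check), and the data frame name `tab` is just for illustration.

```r
# F values and degrees of freedom copied from the ANOVA table in the question
tab <- data.frame(
  term  = c("Strategy", "Accuracy", "Condition", "Strategy:Accuracy",
            "Strategy:Condition", "Accuracy:Condition",
            "Strategy:Accuracy:Condition"),
  Fval  = c(7.5283, 729.1441, 0.2728, 4.8006, 4.2026, 6.7425, 4.1596),
  NumDF = 1,
  DenDF = c(35.1, 35.1, 195.7, 35.1, 3723.1, 3723.1, 3723.1)
)

# Approximate partial eta-squared from F and its degrees of freedom
tab$eta2_partial <- with(tab, (Fval * NumDF) / (Fval * NumDF + DenDF))

print(transform(tab, eta2_partial = round(eta2_partial, 4)))
```

On these numbers the three interactions involving Condition come out at roughly 0.001 to 0.002, against about 0.95 for Accuracy, which fits with their being detectable at this sample size but invisible in a plot.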

That said, you should be paying attention to why the binomial model isn't converging. The lmer() model is seldom appropriate for binary outcomes and might give you predicted probabilities below 0 or above 1. You don't say what the convergence problem is, but logistic regression can run into perfect separation. It's also possible that the default solver or number of iterations wasn't adequate for the size and nature of your data set. The above explanation of interaction effects that "are not interpretable" in the visualized data would still hold.
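A minimal sketch of how you might check both possibilities. Since I don't have your data, it runs on a simulated stand-in called `toy` that reuses your column names; the sample size, effect sizes and factor labels are all made up. The separation check is a plain cell count, and the refit switches glmer() to the "bobyqa" optimizer with a higher evaluation limit through glmerControl(). Whether this cures your particular warning is something I can't know without seeing it.

```r
library(lme4)

set.seed(42)

# Simulated stand-in for the real data (all values hypothetical)
n_id   <- 60
n_head <- 40
heads <- data.frame(
  Headline = factor(seq_len(n_head)),
  Strategy = factor(rep(c("attack", "praise"), times = n_head / 2)),
  Accuracy = factor(rep(c("true", "false"), each = n_head / 2))
)
ids <- data.frame(
  id        = factor(seq_len(n_id)),
  Condition = factor(rep(c("threat", "neutral"), length.out = n_id))
)
toy <- merge(ids, heads)  # no shared columns, so this is the full crossing

# Random intercepts for participants and headlines, plus an Accuracy effect
u_id   <- rnorm(n_id, sd = 1)
u_head <- rnorm(n_head, sd = 0.5)
eta <- -1 + 1.5 * (toy$Accuracy == "true") +
  u_id[as.integer(toy$id)] + u_head[as.integer(toy$Headline)]
toy$Sharing_P3 <- rbinom(nrow(toy), size = 1, prob = plogis(eta))

# 1. Separation check: a design cell with zero "0" or zero "1" responses
#    would point to (quasi-)complete separation
cell_counts <- xtabs(~ Sharing_P3 + Strategy + Accuracy + Condition, data = toy)
print(ftable(cell_counts))

# 2. Refit with a different optimizer and more function evaluations
fit <- glmer(
  Sharing_P3 ~ Strategy * Accuracy * Condition + (1 | id) + (1 | Headline),
  data    = toy,
  family  = binomial,
  control = glmerControl(optimizer = "bobyqa",
                         optCtrl   = list(maxfun = 2e5))
)
print(summary(fit)$coefficients)
```

With your own data you would replace `toy` with `df`. If the warning persists, lme4's allFit() refits the model with every available optimizer, which helps to tell a real problem from a false alarm, and the `?convergence` help page in lme4 lists further things to try.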
