stepcAIC: Error in eval(predvars, data, env): object 'Color1' not found

Question

I want to select the optimal random structure for my mixed effects model (fitted with lmer() from lme4). I found the function stepcAIC() from the package cAIC4, which is supposed to compare models and select the one with the smallest conditional AIC (cAIC) in a stepwise fashion. Although the implementation looks very simple, I get an error.

After fitting my model, I ran the following call:

stepcAIC(model_full, direction="backward")

First, it takes forever to run. Second, I get an error message. I tried explicitly specifying the dataset:

stepcAIC(model_full, direction="backward", data=data_correct)

I also tried updating R to the newest version and running it again, but that doesn't help.

Does anyone have experience with this function and can tell me what I did wrong?

The error I get is this:

Error in eval(predvars, data, env) : object 'Color1' not found

I have a variable named "Color", but not "Color1". Perhaps "Color1" is a name taken from the table of the effects, but then why would it use the name from the summary table and search for it in the data frame?

I also get warnings:

In if (!hasInt(resForThisGroup)) res[[i]] <- res[[i]][-j] : the condition has length > 1 and only the first element will be used

Here is a [link](https://drive.google.com/open?id=1jIJn2rzK3SwpKMfKGDhseYcOxinuwpue) to download data_correct and model_full:

This is how I created model_full:

model_full <- lmer(
  log_RT ~ Polarity + Delay + Truth_value + Type + Color + Order +
    Polarity:Delay + Polarity:Truth_value + Polarity:Order +
    Polarity:Type + Polarity:Color + Delay:Truth_value +
    Truth_value:Delay:Polarity +
    (1 + Polarity * Color + Delay + Delay:Polarity + Truth_value | Subject),
  data = data_correct,
  control = lmerControl(optimizer = "bobyqa"),
  REML = FALSE
)

This is the output of model_full:

Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log_RT ~ Polarity + Delay + Truth_value + Type + Color + Order +  
    Polarity:Delay + Polarity:Truth_value + Polarity:Order +  
    Polarity:Type + Polarity:Color + Delay:Truth_value + Truth_value:Delay:Polarity +  
    (1 + Polarity * Color + Delay + Delay:Polarity + Truth_value |          Subject)
   Data: data_correct
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
 16556.6  16896.2  -8235.3  16470.6    19838 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9078 -0.6585 -0.1065  0.5654  6.5045 

Random effects:
 Groups   Name             Variance  Std.Dev. Corr                               
 Subject  (Intercept)      0.0652479 0.25544                                     
          Polarity1        0.0045472 0.06743   0.51                              
          Color1           0.0030415 0.05515   0.15  0.13                        
          Delay1           0.0005240 0.02289   0.22 -0.05 -0.02                  
          Truth_value1     0.0022027 0.04693   0.00  0.48  0.23  0.00            
          Polarity1:Color1 0.0003927 0.01982   0.04 -0.33  0.57 -0.50 -0.12      
          Polarity1:Delay1 0.0001981 0.01408   0.61  0.07  0.06  0.55  0.06 -0.04
 Residual                  0.1304137 0.36113                                     
Number of obs: 19881, groups:  Subject, 38

Fixed effects:
                                Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)                    6.572e+00  4.152e-02  3.800e+01 158.301  < 2e-16 ***
Polarity1                      1.234e-01  1.124e-02  3.797e+01  10.985 2.38e-13 ***
Delay1                        -6.476e-02  4.512e-03  3.817e+01 -14.352  < 2e-16 ***
Truth_value1                   5.266e-02  8.034e-03  3.805e+01   6.556 9.83e-08 ***
Type1                          7.531e-03  2.562e-03  1.962e+04   2.939 0.003292 ** 
Color1                         2.512e-02  9.308e-03  3.756e+01   2.698 0.010379 *  
Order1                        -3.524e-02  8.981e-03  3.794e+01  -3.924 0.000354 ***
Polarity1:Delay1              -2.244e-02  3.433e-03  3.834e+01  -6.538 1.00e-07 ***
Polarity1:Truth_value1        -5.728e-02  2.563e-03  1.963e+04 -22.347  < 2e-16 ***
Polarity1:Order1              -1.250e-02  3.547e-03  3.823e+01  -3.525 0.001119 ** 
Polarity1:Type1               -7.107e-03  2.562e-03  1.962e+04  -2.774 0.005544 ** 
Polarity1:Color1               4.012e-03  4.114e-03  3.790e+01   0.975 0.335639    
Delay1:Truth_value1            5.301e-03  2.563e-03  1.963e+04   2.068 0.038629 *  
Polarity1:Delay1:Truth_value1  9.625e-03  2.563e-03  1.963e+04   3.755 0.000174 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Answer 1

(Only sort-of an answer; will delete later if appropriate.)

I can't replicate your problem because your data set is too big for the machine I'm working on at the moment; when I try to run stepcAIC(model_full, direction="backward") I get:

The cAIC of the initial model can not be calculated.

which is explained by the message from cAIC(model_full):

Error: cannot allocate vector of size 2.9 Gb

This is perhaps not surprising, as the model is moderately large (~20K observations, 28 parameters). (Digging into the code, we can see that it is trying to construct a dense identity matrix with dimensions equal to the number of observations; in this case n * n * 8 bytes is nearly 3 Gb ...)
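As a quick check of that arithmetic, here is a minimal sketch that uses only the observation count from the model summary (19881) and assumes 8 bytes per double-precision entry. It does not call cAIC4 at all; it just reproduces the size quoted in the error message.

```r
# Memory needed for a dense n x n matrix of doubles (8 bytes per entry).
n <- 19881            # "Number of obs" from the model summary
bytes <- n^2 * 8      # total bytes for the dense matrix
gib <- bytes / 2^30   # R's "Gb" in allocation errors is 1024^3 bytes
round(gib, 1)         # 2.9
```

The result matches the 2.9 Gb in the allocation error, which is consistent with a single dense n-by-n matrix being the culprit.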

Computing cAIC is really only necessary if you want to select models on the basis of individual-level predictions; if you want to select on the basis of population-level predictions, AIC should be acceptable (and is computationally much cheaper). The simplest selection procedure is based on p-values (I don't like it because I don't think modeling decisions should be based on significance testing, but lots of people use it).

The step() function in lmerTest will do p-value based backward selection:

system.time(ss <- step(model_full,reduce.fixed=FALSE))

takes about 4.5 minutes on my old laptop. The result (abbreviated) is that it tests the effect of dropping Truth_value, Polarity:Color, and Polarity:Delay from the random effects, and concludes that it shouldn't drop any of them.

Backward reduced random-effect table:

                     Eliminated npar  logLik   AIC     LRT Df Pr(>Chisq)    
<none>                            43 -8235.3 16557                          
T_i(1+P*C+D+D:P+T_|S          0   36 -8366.3 16804 261.915  7  < 2.2e-16 ***
P:Ci(1+P*C+D+D:P+T|S          0   36 -8257.1 16586  43.693  7  2.451e-07 ***
P:Di(1+P*C+D+D:P+T|S          0   36 -8245.0 16562  19.507  7   0.006739 ** 
---

From ?step.lmerModLmerTest:

... a column '"Eliminated"' indicating the order in which terms are eliminated from the model with zero ('0') indicating that the term is not eliminated from the model.

In this case the step() function has tried to drop all of the highest-order terms (two-way interactions + main effect of Truth_value, which isn't involved in an interaction), and found that it doesn't want to drop any of them. Here the p-value criterion (all terms have p<0.05) and the AIC criterion (all reduced models have AIC larger than the original model) agree with each other.
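To see how the two criteria line up, the sketch below recomputes AIC, the likelihood-ratio statistic, and its p-value from the rounded logLik and npar values printed in the table above. The data frame and the helper aic() are illustrative names, not part of lmerTest, and because the inputs are rounded the results only approximately match the printed columns.

```r
# Values copied (rounded) from the backward-reduction table above.
full_npar   <- 43
full_logLik <- -8235.3
reduced <- data.frame(
  term   = c("Truth_value", "Polarity:Color", "Polarity:Delay"),
  npar   = c(36, 36, 36),
  logLik = c(-8366.3, -8257.1, -8245.0)
)

# AIC = -2 * logLik + 2 * (number of parameters)
aic <- function(logLik, npar) -2 * logLik + 2 * npar
full_AIC    <- aic(full_logLik, full_npar)
reduced$AIC <- aic(reduced$logLik, reduced$npar)

# Likelihood-ratio test of each reduced model against the full model
reduced$LRT <- 2 * (full_logLik - reduced$logLik)
reduced$Df  <- full_npar - reduced$npar
reduced$p   <- pchisq(reduced$LRT, df = reduced$Df, lower.tail = FALSE)

reduced$keep_by_AIC <- reduced$AIC > full_AIC   # dropping makes AIC worse
reduced$keep_by_p   <- reduced$p < 0.05         # dropping is "significant"
print(reduced)
```

Both keep_by_AIC and keep_by_p come out TRUE for all three terms, which is the agreement described above.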

Answer 1
1 accepted 2019-09-04 19:18:44