
Repeated-Measures ANOVA: ezANOVA vs. aov vs. lme syntax

This question concerns both syntax and semantics, so there is also a (so far unanswered) duplicate on Cross Validated: https://stats.stackexchange.com/questions/113324/repeated-measures-anova-ezanova-vs-aov-vs-lme-syntax

In the machine-learning domain, I evaluated 4 classifiers on the same 5 datasets, i.e., each classifier returned one performance measure for each of datasets 1 through 5. Now I want to know whether the classifiers differ significantly in their performance. Here's some toy data:

Performance<-c(2,3,3,2,3,1,2,2,1,1,3,1,3,2,3,2,1,2,1,2)
Dataset<-factor(c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5))
Classifier<-factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4))
data<-data.frame(Classifier,Dataset,Performance)

Following a textbook, I conducted a one-way repeated-measures ANOVA. I interpreted performance as the dependent variable, the classifiers as subjects, and the datasets as the within-subjects factor. Using aov, I got:

model <- aov(Performance ~ Classifier + Error(factor(Dataset)), data=data)
summary(model)

This yields the following output:

Error: factor(Dataset)
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals  4    2.5   0.625               

Error: Within
            Df Sum Sq Mean Sq F value Pr(>F)  
Classifier  3    5.2  1.7333   4.837 0.0197 *
Residuals  12    4.3  0.3583 
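
As a cross-check, the sums of squares in this table can be reproduced by hand in base R. The following is a minimal sketch that assumes the balanced toy data above (one score per classifier/dataset cell); the variable names such as `grand` and `ss_classifier` are introduced here for illustration only.

```r
# Toy data from the question: 4 classifiers x 5 datasets, one score per cell.
Performance <- c(2,3,3,2,3, 1,2,2,1,1, 3,1,3,2,3, 2,1,2,1,2)
Dataset     <- factor(rep(1:5, times = 4))
Classifier  <- factor(rep(1:4, each = 5))

grand <- mean(Performance)

# Between-level sums of squares (balanced design: every level has the same n).
ss_classifier <- 5 * sum((tapply(Performance, Classifier, mean) - grand)^2)
ss_dataset    <- 4 * sum((tapply(Performance, Dataset, mean) - grand)^2)

# What remains after removing both main effects is the "Within" residual.
ss_total    <- sum((Performance - grand)^2)
ss_residual <- ss_total - ss_classifier - ss_dataset

df_classifier <- nlevels(Classifier) - 1                  # 3
df_residual   <- df_classifier * (nlevels(Dataset) - 1)   # 12

# F ratio and p-value for the Classifier effect.
f_classifier <- (ss_classifier / df_classifier) / (ss_residual / df_residual)
p_classifier <- pf(f_classifier, df_classifier, df_residual, lower.tail = FALSE)

round(c(SS_dataset = ss_dataset, SS_classifier = ss_classifier,
        SS_residual = ss_residual, F = f_classifier, p = p_classifier), 4)
```

The printed values should agree with the aov table: 2.5 in the Dataset stratum, 5.2 and 4.3 in the Within stratum, and F of about 4.837 with p of about 0.0197.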

I get similar results when using a linear mixed-effects model:

library(nlme)  # provides lme()
model <- lme(Performance ~ Classifier, random = ~1|Dataset/Classifier, data=data)
result <- anova(model)

I then tried to reproduce the results with ezANOVA in order to perform Mauchly's test for sphericity:

library(ez)  # provides ezANOVA()
ezANOVA(data=data, dv=.(Performance), wid=.(Classifier), within=.(Dataset), detailed=TRUE, type=3)

This yields the following output:

        Effect DFn DFd  SSn SSd         F          p p<.05       ges
 1 (Intercept)   1   3 80.0 5.2 46.153846 0.00652049     * 0.8938547
 2     Dataset   4  12  2.5 4.3  1.744186 0.20497686       0.2083333

This clearly doesn't correspond to the earlier output from aov/lme. However, when I exchange "Dataset" with "Classifier" in the ezANOVA definition, I get the expected results.

I now wonder whether my textbook is wrong (the aov definition) or whether I misunderstood the ezANOVA syntax. Furthermore, why do I get Mauchly's test results only after rewriting the ezANOVA statement, but not in the first case?

Since you want to compare classifiers and not datasets, the within-subjects factor (within) is Classifier and the subject identifier (wid) is Dataset. So the correct syntax for your ezANOVA example would be:

ezANOVA(data=data, dv=.(Performance), within=.(Classifier), wid=.(Dataset), detailed=TRUE)
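
To see what swapping the two arguments changes, here is a small base-R sketch (not the ez implementation) that computes the F ratio for whichever factor is treated as the within-subjects factor. The helper `rm_f()` is hypothetical and assumes a balanced design with exactly one observation per cell, as in the toy data.

```r
# Toy data from the question: 4 classifiers x 5 datasets, one score per cell.
Performance <- c(2,3,3,2,3, 1,2,2,1,1, 3,1,3,2,3, 2,1,2,1,2)
Dataset     <- factor(rep(1:5, times = 4))
Classifier  <- factor(rep(1:4, each = 5))

# Hypothetical helper: F test for `within`, treating `wid` as the subject.
rm_f <- function(y, within, wid) {
  grand     <- mean(y)
  n_within  <- nlevels(within)
  n_wid     <- nlevels(wid)
  # Each within-level mean averages over n_wid subjects, and vice versa.
  ss_within <- n_wid    * sum((tapply(y, within, mean) - grand)^2)
  ss_wid    <- n_within * sum((tapply(y, wid, mean) - grand)^2)
  ss_error  <- sum((y - grand)^2) - ss_within - ss_wid
  dfn <- n_within - 1
  dfd <- dfn * (n_wid - 1)
  f   <- (ss_within / dfn) / (ss_error / dfd)
  c(DFn = dfn, DFd = dfd, F = f, p = pf(f, dfn, dfd, lower.tail = FALSE))
}

# Roles as in the first ezANOVA call: the test is about Dataset.
first_call <- rm_f(Performance, within = Dataset, wid = Classifier)
# Roles as in the corrected call: the test is about Classifier.
corrected  <- rm_f(Performance, within = Classifier, wid = Dataset)

round(rbind(first_call, corrected), 4)
```

Both rows use the same error sum of squares (4.3 on 12 degrees of freedom); only the numerator changes. The first row should reproduce the Dataset line of the ezANOVA output (F of about 1.744), and the second the Classifier line of the aov output (F of about 4.837).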

Btw, there is no need to specify the type of sums of squares. Since you have only one factor, all types of sums of squares will produce the same results anyway.
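
On the Mauchly question: sphericity concerns the differences between levels of the within-subjects factor, so the test needs more subjects than there are independent contrasts. With Classifier as wid there are only 4 subjects for the 5 Dataset levels (3 residual degrees of freedom for 4 contrasts), so the contrast covariance matrix is singular. That is a plausible reason the first call reports no Mauchly result, but it is an inference and not something confirmed from the ez source. With Dataset as wid, the test can be sketched in base R with `mauchly.test()` from the stats package; the matrix name `wide` is introduced here, and the result is not guaranteed to match ezANOVA's output digit for digit.

```r
# Toy data from the question, ordered classifier by classifier.
Performance <- c(2,3,3,2,3, 1,2,2,1,1, 3,1,3,2,3, 2,1,2,1,2)

# Wide layout: one row per dataset (subject), one column per classifier.
# matrix() fills column by column, which matches the ordering above.
wide <- matrix(Performance, nrow = 5, ncol = 4,
               dimnames = list(paste0("dataset", 1:5),
                               paste0("classifier", 1:4)))

fit <- lm(wide ~ 1)                # multivariate intercept-only model
mt  <- mauchly.test(fit, X = ~1)   # sphericity of the within-subject contrasts
mt
```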
