
ANOVA - comparing 3 groups in R

I am trying to analyze a data set for a research project, but I have run into a lot of issues and have not been able to find a directly related answer online. I have worked with other statistical programs but am new to R, and I have had the hardest time figuring out how to shape my data set to best answer my questions.

In this research, participants were asked to answer questions about pictures they were shown. The pictures were of faces exhibiting 3 emotions (happy, angry, sad). I now want to compare the answers given to each question across those pictures, meaning I want to see whether there are differences between these three groups.

I have used a one-way ANOVA for this in the past. In Minitab I would code the images as a factor with 3 levels (1, 2, 3) in one column and put the scores for the given question in the column next to it, so each picture and its score for that question are lined up in the same row.

  Image pleasing
1     1        3
2     1        2
3     1        1
4     1        1
5     1        1
6     1        2

This is how I have it set up in R as well, but when I try to run an ANOVA I cannot, because Image is still of class integer and not a factor. Therefore it gives me this:

> Paov <- aov(Image ~ pleasing)
> summary(Paov)
             Df Sum Sq Mean Sq F value Pr(>F)
pleasing      1    0.7  0.6546   0.978  0.323
Residuals   813  544.3  0.6696               
26 observations deleted due to missingness

and then a post-hoc Tukey's test is meaningless. Minitab was able to show me the mean pleasing score for each image and then tell me how they differ significantly. How can I make Image a factor in R? And then how can I properly compare these three groups in their pleasing scores?

Given the description of your data, here's a way to perform the analysis of variance and the Tukey test. First, some not-so-random data (which will give "interesting" results):

set.seed(40)
dat <- data.frame(Image = factor(rep(1:3, each = 10)),
                  Pleasing = c(sample(1:2, 10, replace = TRUE),
                               sample(c(1, 3), 10, replace = TRUE),
                               sample(2:3, 10, replace = TRUE)))
head(dat)
#   Image Pleasing
# 1     1        2
# 2     1        2
# 3     1        2
# 4     1        1
# 5     1        1
# 6     1        1
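The sample data above creates Image as a factor from the start. For a data frame that already exists with Image stored as integers, as in the question, a conversion along these lines should work. This is only a sketch: the data frame name mydata and its values are made up for illustration.

```r
# Hypothetical data frame mimicking the question: Image is stored as integer
mydata <- data.frame(Image = rep(1:3, each = 4),
                     pleasing = c(3, 2, 1, 1, 1, 2, 3, 3, 2, 3, 3, 1))
class(mydata$Image)    # "integer"

# Convert the grouping column to a factor so aov treats it as 3 groups
mydata$Image <- factor(mydata$Image)
class(mydata$Image)    # "factor"
nlevels(mydata$Image)  # 3
```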

The aov call is quite simple. Just note that you have to pass the data argument if your variables are in a data frame (using attach isn't recommended):

dat.aov <- aov(Pleasing ~ Image, data=dat)
summary(dat.aov)
#             Df Sum Sq Mean Sq F value  Pr(>F)   
# Image        2    7.2   3.600   6.568 0.00474 **
# Residuals   27   14.8   0.548                   
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
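Two things differ from the call in the question: the score is on the left of the formula and the grouping factor is on the right, which is why Image now has 2 degrees of freedom instead of 1. The question also asked for the mean score per image; a minimal sketch using base R's tapply follows. The small data set here is made up and is not the answer's simulated data.

```r
# Small made-up data set; values are illustrative only
dat <- data.frame(Image = factor(rep(1:3, each = 4)),
                  Pleasing = c(1, 2, 1, 2, 1, 3, 3, 1, 2, 3, 3, 3))

# Response on the left of ~, grouping factor on the right
fit <- aov(Pleasing ~ Image, data = dat)

# Mean Pleasing score for each Image level
group_means <- tapply(dat$Pleasing, dat$Image, mean)
print(group_means)

# Image has 3 levels, so its row in the ANOVA table has 2 degrees of freedom
image_df <- summary(fit)[[1]]$Df[1]
print(image_df)
```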

Now for the Tukey test, there are different ways to do it in R. I like to use the multcomp package because it provides more information with the results:

library(multcomp)

tukey <- cld(glht(dat.aov, linfct = mcp(Image = "Tukey")), decreasing = TRUE)

tukey$mcletters$Letters
#  1    2    3 
# "b" "ab"  "a" 

The syntax looks rather complicated because in multcomp you use a general linear hypothesis function (glht), in which you specify the multiple comparisons (mcp), and then extract the compact letter display of the Tukey results (cld). In the letter display, groups that share a letter are not significantly different, so here images 1 and 3 differ from each other while image 2 differs from neither.
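One of the other ways to run the Tukey test is TukeyHSD from base R's stats package, which needs no extra package and reports the pairwise differences with confidence intervals and adjusted p-values. A minimal sketch on a small made-up data set (not the answer's simulated data):

```r
# Small made-up data set; values are illustrative only
dat <- data.frame(Image = factor(rep(1:3, each = 4)),
                  Pleasing = c(1, 2, 1, 2, 1, 3, 3, 1, 2, 3, 3, 3))
fit <- aov(Pleasing ~ Image, data = dat)

# Tukey's honest significant differences for the Image factor
tk <- TukeyHSD(fit, "Image")
print(tk)

# tk$Image is a matrix with one row per pair of levels and the
# columns diff, lwr, upr and "p adj"
rownames(tk$Image)
```

This gives the pairwise differences directly instead of the letter display, which can be easier to read with only three groups.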

You can even plot the Tukey results, although the boxplots don't look very nice for this kind of data.
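The plotting code is not shown in the answer. Presumably it was the plot method that multcomp provides for cld objects, which draws one box per group with the letters on top. The sketch below rests on that assumption, and uses a small made-up data set rather than the answer's simulated data.

```r
library(multcomp)  # assumed to be installed

# Small made-up data set; values are illustrative only
dat <- data.frame(Image = factor(rep(1:3, each = 4)),
                  Pleasing = c(1, 2, 1, 2, 1, 3, 3, 1, 2, 3, 3, 3))
dat.aov <- aov(Pleasing ~ Image, data = dat)

# Compact letter display of the Tukey comparisons, as in the answer
tukey <- cld(glht(dat.aov, linfct = mcp(Image = "Tukey")), decreasing = TRUE)

# Enlarge the top margin so the letters above the boxes are not clipped
old.par <- par(mai = c(1, 1, 1.25, 1), no.readonly = TRUE)
plot(tukey)  # one box per Image level, with its letters on top
par(old.par)
```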


As a final note, it's important to mention that I use this kind of analysis for continuous data (experimental lab measurements), and I'm not sure it's correct for your categorical data (1-3 expression choice).
