2-way ANOVA, is R comparing the correct data?

Year Location AVGCover
2010     1      0.1  
2010     1      0.5
2010     1      1
2010     2      0.75
2010     2      0.8  
2010     2      1.6
2010     3      1.1
2010     3      0.5
2010     3      0.6
2011     1      0.2
2011     1      0.2
2011     1      0.3
2011     2      0.5
2011     2      0.7
2011     2      0.4
2011     3      0.6
2011     3      0.1
2011     3      0

I have made a small subset of my data set; it records average percentage cover at 3 locations over 2 years. I believe I need a 2-way ANOVA as the statistical test, but I'm having some trouble. This is my code so far:

anova(mod1 <- lm(df$AVGCover ~ df$Location + df$Year + df$Location * 
      df$Year))

pairwise.t.test(df$AVGCover, df$Year, p.adj = "none")
pairwise.t.test(df$AVGCover, df$Location, p.adj = "none")

Specifically, I wish to look at comparisons such as Location 1 in 2010 versus Location 1 in 2011, but when I run pairwise.t.test, R only compares e.g. Location 1 with Location 2. I want to be sure my R code makes the comparisons I intend, but I'm generally uncertain, so I'm hoping for some help.

One last thing: my ANOVA output says that my df = 1, and I'm not sure this should be the case. Where am I going wrong?

You should create your data with the correct structure. Both Year and Location are clearly discrete, i.e. they should be R factors, and you should use the R formula interface with a data argument. The formula AVGCover ~ Location * Year includes both main effects and their interaction:

txt <- "Year Location AVGCover
 2010     1      0.1  
 2010     1      0.5
 2010     1      1
 2010     2      0.75
 2010     2      0.8  
 2010     2      1.6
 2010     3      1.1
 2010     3      0.5
 2010     3      0.6
 2011     1      0.2
 2011     1      0.2
 2011     1      0.3
 2011     2      0.5
 2011     2      0.7
 2011     2      0.4
 2011     3      0.6
 2011     3      0.1
 2011     3      0"
 dfrm <- read.table(text=txt, header=TRUE, colClasses=c("factor", "factor", "numeric") )

Notice that I didn't use df as a name, since df is already the name of the density function of the F distribution.
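
As a rough check on the df = 1 puzzle from the question, the sketch below fits the same formula twice, once with Year and Location left as plain numbers and once with both converted to factors, and compares the degrees of freedom. The names num, fac, tab_num and tab_fac are only illustrative, and the data are rebuilt inline so the block runs on its own.

```r
# Example data with Year and Location as plain numbers, which is what
# read.table() produces when colClasses is not given.
num <- data.frame(
  Year     = rep(c(2010, 2011), each = 9),
  Location = rep(rep(1:3, each = 3), times = 2),
  AVGCover = c(0.1, 0.5, 1, 0.75, 0.8, 1.6, 1.1, 0.5, 0.6,
               0.2, 0.2, 0.3, 0.5, 0.7, 0.4, 0.6, 0.1, 0)
)

# Same data with both predictors converted to factors.
fac <- transform(num, Year = factor(Year), Location = factor(Location))

# Numeric predictors: each term is a single slope, so each term gets 1 Df.
tab_num <- anova(lm(AVGCover ~ Location * Year, data = num))

# Factor predictors: a term with k levels gets k - 1 Df.
tab_fac <- anova(lm(AVGCover ~ Location * Year, data = fac))

print(tab_num$Df)  # Location, Year, Location:Year, Residuals
print(tab_fac$Df)
```

If the numeric fit shows 1 Df for every term while the factor fit shows 2, 1 and 2, the original output was most likely treating Year and Location as continuous covariates.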

> anova(mod1 <- lm(AVGCover ~ Location * Year, data=dfrm))
Analysis of Variance Table

Response: AVGCover
              Df  Sum Sq Mean Sq F value  Pr(>F)  
Location       2 0.54361 0.27181  2.4555 0.12767  
Year           1 0.86681 0.86681  7.8306 0.01609 *
Location:Year  2 0.04361 0.02181  0.1970 0.82380  
Residuals     12 1.32833 0.11069                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> pairwise.t.test(dfrm$AVGCover, dfrm$Year, p.adj = "none")

    Pairwise comparisons using t tests with pooled SD 

data:  dfrm$AVGCover and dfrm$Year 

     2010 
2011 0.016

P value adjustment method: none 
> pairwise.t.test(dfrm$AVGCover, dfrm$Location, p.adj = "none")
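
Neither call above compares a single location across the two years. One possible way to get those cell-by-cell comparisons is to group on the combination of Location and Year with interaction(). This is a sketch rather than the only approach; cell and res are illustrative names, and the p-values are left unadjusted only to match the calls above.

```r
# Rebuild the example data so this block runs on its own.
dfrm <- data.frame(
  Year     = factor(rep(c(2010, 2011), each = 9)),
  Location = factor(rep(rep(1:3, each = 3), times = 2)),
  AVGCover = c(0.1, 0.5, 1, 0.75, 0.8, 1.6, 1.1, 0.5, 0.6,
               0.2, 0.2, 0.3, 0.5, 0.7, 0.4, 0.6, 0.1, 0)
)

# One grouping factor with a level per Location-by-Year cell, e.g. "1.2010".
cell <- interaction(dfrm$Location, dfrm$Year)

# All cell-versus-cell t tests, using the SD pooled over the six cells.
res <- pairwise.t.test(dfrm$AVGCover, cell, p.adj = "none")
print(res)

# Pull out the same-location, across-year comparisons.
p_loc1 <- res$p.value["1.2011", "1.2010"]
p_loc2 <- res$p.value["2.2011", "2.2010"]
p_loc3 <- res$p.value["3.2011", "3.2010"]
print(c(loc1 = p_loc1, loc2 = p_loc2, loc3 = p_loc3))
```

With six cells there are fifteen pairwise tests, so an adjustment such as p.adj = "holm" may be worth considering if more than a few pre-planned comparisons are reported.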

To get the magnitude of the estimated difference, pass predict() a data frame whose columns have the same names (and classes) as the independent variables on the right-hand side of the formula:

predict(mod1, newdata = data.frame(Location = factor(1), Year = factor(c(2010, 2011))))
 # returns
        1         2 
0.5333333 0.2333333 
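
The two predictions can then be subtracted to get the estimated change at Location 1. A minimal sketch, with the data rebuilt inline and nd, pred, est_diff and cell_means as illustrative names:

```r
# Rebuild the example data and the model so this block runs on its own.
dfrm <- data.frame(
  Year     = factor(rep(c(2010, 2011), each = 9)),
  Location = factor(rep(rep(1:3, each = 3), times = 2)),
  AVGCover = c(0.1, 0.5, 1, 0.75, 0.8, 1.6, 1.1, 0.5, 0.6,
               0.2, 0.2, 0.3, 0.5, 0.7, 0.4, 0.6, 0.1, 0)
)
mod1 <- lm(AVGCover ~ Location * Year, data = dfrm)

# Predicted means for Location 1 in each year.
nd   <- data.frame(Location = factor(1), Year = factor(c(2010, 2011)))
pred <- predict(mod1, newdata = nd)

# Estimated change from 2010 to 2011 at Location 1 (about -0.3).
est_diff <- unname(diff(pred))
print(est_diff)

# Because the model includes the interaction, the predictions coincide
# with the raw cell means, which gives a quick sanity check.
cell_means <- tapply(dfrm$AVGCover, list(dfrm$Location, dfrm$Year), mean)
print(cell_means["1", ])
```

Changing factor(1) to factor(2) or factor(3) in nd gives the corresponding estimate for the other locations.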
