2-way ANOVA, is R comparing the correct data?

Year Location AVGCover
2010     1      0.1  
2010     1      0.5
2010     1      1
2010     2      0.75
2010     2      0.8  
2010     2      1.6
2010     3      1.1
2010     3      0.5
2010     3      0.6
2011     1      0.2
2011     1      0.2
2011     1      0.3
2011     2      0.5
2011     2      0.7
2011     2      0.4
2011     3      0.6
2011     3      0.1
2011     3      0

I have made a small subset of my data set; it records the average percentage cover at 3 locations over 2 years. I believe I need a 2-way ANOVA as the statistical test, but I'm having some trouble. This is my code so far:

anova(mod1 <- lm(df$AVGCover ~ df$Location + df$Year + df$Location * 
      df$Year))

pairwise.t.test(df$AVGCover, df$Year, p.adj = "none")
pairwise.t.test(df$AVGCover, df$Location, p.adj = "none")

Specifically, I want comparisons such as Location 1 in 2010 versus Location 1 in 2011, but when I run pairwise.t.test, R only compares, for example, Location 1 with Location 2. I want to be sure that my R code is making the comparisons I intend, but I'm uncertain, so I am hoping for some help.

One last thing: my ANOVA output says that df = 1, and I'm not sure this should be the case. Where am I going wrong?

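You should create your data with the correct structure. Both Year and Location are clearly discrete, i.e. they should be R factors; a numeric Location is fitted as a single slope, which is why the ANOVA table reports only 1 df for it. You should also use the R formula interface with a data argument rather than df$ prefixes. The formula AVGCover ~ Location * Year expands to both main effects plus their interaction: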

txt <- "Year Location AVGCover
 2010     1      0.1  
 2010     1      0.5
 2010     1      1
 2010     2      0.75
 2010     2      0.8  
 2010     2      1.6
 2010     3      1.1
 2010     3      0.5
 2010     3      0.6
 2011     1      0.2
 2011     1      0.2
 2011     1      0.3
 2011     2      0.5
 2011     2      0.7
 2011     2      0.4
 2011     3      0.6
 2011     3      0.1
 2011     3      0"
 dfrm <- read.table(text=txt, header=TRUE, colClasses=c("factor", "factor", "numeric") )

Note that I didn't use df as the name, since df is already the name of the density function of the F-distribution.
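To see how the column classes drive the degrees of freedom, here is a minimal sketch that fits the same formula twice, once with numeric columns and once after converting them with factor(). The names dfrm_num, dfrm_fac, tab_num and tab_fac are made up for this illustration, and the data frame is rebuilt inline so the snippet runs on its own:

```r
# Same 18 observations as above, with Year and Location left numeric
dfrm_num <- data.frame(
  Year     = rep(c(2010, 2011), each = 9),
  Location = rep(rep(1:3, each = 3), times = 2),
  AVGCover = c(0.1, 0.5, 1, 0.75, 0.8, 1.6, 1.1, 0.5, 0.6,
               0.2, 0.2, 0.3, 0.5, 0.7, 0.4, 0.6, 0.1, 0)
)

# Numeric predictors: each term is a single slope, so every term gets 1 df
tab_num <- anova(lm(AVGCover ~ Location * Year, data = dfrm_num))

# Convert both columns to factors and refit the same formula
dfrm_fac <- transform(dfrm_num,
                      Year     = factor(Year),
                      Location = factor(Location))
tab_fac <- anova(lm(AVGCover ~ Location * Year, data = dfrm_fac))

# Df column order: Location, Year, Location:Year, Residuals
print(tab_num$Df)   # 1 1 1 14
print(tab_fac$Df)   # 2 1 2 12
```

If the data are already loaded, converting the existing columns with factor() is equivalent to setting colClasses in read.table.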

anova(mod1 <- lm(AVGCover ~ Location * Year, data=dfrm))
Analysis of Variance Table

Response: AVGCover
              Df  Sum Sq Mean Sq F value  Pr(>F)  
Location       2 0.54361 0.27181  2.4555 0.12767  
Year           1 0.86681 0.86681  7.8306 0.01609 *
Location:Year  2 0.04361 0.02181  0.1970 0.82380  
Residuals     12 1.32833 0.11069                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> pairwise.t.test(dfrm$AVGCover, dfrm$Year, p.adj = "none")

    Pairwise comparisons using t tests with pooled SD 

data:  dfrm$AVGCover and dfrm$Year 

     2010 
2011 0.016

P value adjustment method: none 
> pairwise.t.test(dfrm$AVGCover, dfrm$Location, p.adj = "none")

To get the magnitude of the estimated difference, pass predict() a data frame whose columns have the same names (and classes) as the independent variables on the right-hand side of the formula:

predict(mod1, newdata =data.frame(Location=factor(1), Year=factor(c(2010,2011)) ) )
 # returns
        1         2 
0.5333333 0.2333333 
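For the comparison the question actually asks about (Location 1 in 2010 versus Location 1 in 2011), one possible approach is to give pairwise.t.test a grouping factor with one level per Location-by-Year cell, built with interaction(). This is a sketch rather than the only way to do it; the names cell, pw, nd and est_diff are illustrative, and the data are rebuilt inline so the snippet is self-contained:

```r
# Rebuild the data with Year and Location as factors
dfrm <- data.frame(
  Year     = factor(rep(c(2010, 2011), each = 9)),
  Location = factor(rep(rep(1:3, each = 3), times = 2)),
  AVGCover = c(0.1, 0.5, 1, 0.75, 0.8, 1.6, 1.1, 0.5, 0.6,
               0.2, 0.2, 0.3, 0.5, 0.7, 0.4, 0.6, 0.1, 0)
)
mod1 <- lm(AVGCover ~ Location * Year, data = dfrm)

# One level per cell, named like "1.2010", "1.2011", ...
cell <- interaction(dfrm$Location, dfrm$Year)

# Unadjusted pairwise t tests between all six cells (pooled SD)
pw <- pairwise.t.test(dfrm$AVGCover, cell, p.adjust.method = "none")

# p-value for Location 1, 2011 versus Location 1, 2010
p_loc1 <- pw$p.value["1.2011", "1.2010"]
print(p_loc1)

# Estimated difference (2011 minus 2010) for Location 1 from the model
nd <- data.frame(
  Location = factor(1, levels = levels(dfrm$Location)),
  Year     = factor(c(2010, 2011), levels = levels(dfrm$Year))
)
pred <- predict(mod1, newdata = nd)
est_diff <- unname(pred[2] - pred[1])
print(est_diff)   # -0.3
```

With six cells there are 15 pairwise comparisons, so a p.adjust.method other than "none" (for example "holm") may be worth considering if more than a few planned contrasts are examined.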
