
3 Factor Nested ANOVA in R

I am trying to replicate a 3-factor nested ANOVA analysis from a paper: Underwood, A.J. (1993) The mechanics of spatially replicated sampling programmes to detect environmental impacts in a variable world.

The data for the example (from Table 3, Underwood 1993) can be produced by:

dat <-
structure(list(B = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Label = c("A", "B"), class = "factor"), C = structure(c(2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("C", "I"), class = "factor"),
    Times = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
    4L, 4L), .Label = c("1", "2", "3", "4"), class = "factor"),
    Locations = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L,
    1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
    3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L,
    2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L,
    1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
    3L), Y = c(59L, 51L, 45L, 46L, 40L, 32L, 39L, 32L, 25L, 51L,
    44L, 37L, 55L, 47L, 41L, 31L, 38L, 45L, 41L, 47L, 55L, 43L,
    36L, 29L, 23L, 30L, 37L, 57L, 50L, 43L, 36L, 44L, 51L, 39L,
    29L, 23L, 38L, 44L, 52L, 31L, 38L, 45L, 42L, 35L, 28L, 52L,
    44L, 37L, 51L, 43L, 37L, 38L, 31L, 24L, 60L, 52L, 46L, 30L,
    37L, 44L, 41L, 34L, 27L, 53L, 46L, 39L, 40L, 34L, 26L, 21L,
    27L, 35L), Times.unique = structure(c(5L, 5L, 5L, 5L, 5L,
    5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L,
    7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
    8L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
    2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
    4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("A_1", "A_2", "A_3",
    "A_4", "B_1", "B_2", "B_3", "B_4"), class = "factor")), .Names = c("B",
"C", "Times", "Locations", "Y", "Times.unique"), row.names = c(NA,
-72L), class = "data.frame")

dat

The data frame dat has 4 factors:

B - has two levels, "B" (before) and "A" (after).

Times - 8 levels, 4 within before ("B") and 4 within after ("A"), coded as 1:4 within each. Note that the variable Times.unique carries the same information but with a unique code for each time (before and after).

Locations - has three levels, all measured at every time, both before and after.

C - has two levels, control (C) and impact (I). Note: two locations are control and one is impact.
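To make the layout concrete, here is a minimal sketch on a made-up data frame. The name `toy`, the simulated response and the `rep` column are illustrative assumptions, not Underwood's data. It mirrors the structure described above and shows one way a unique time code like Times.unique could be derived from B and Times with interaction():

```r
# Hypothetical toy layout: 2 periods x 4 times x 3 locations x 3 replicates = 72 rows
set.seed(1)
toy <- expand.grid(
  rep       = 1:3,                     # replicate within each Times x Locations cell
  Locations = factor(1:3),
  Times     = factor(1:4),             # coded 1:4 within each period
  B         = factor(c("B", "A"))      # "B" = before, "A" = after
)
toy$Y <- rnorm(nrow(toy), mean = 40, sd = 8)   # simulated response, illustration only

# In the question's data, location 1 is the impact site and 2 and 3 are controls
toy$C <- factor(ifelse(toy$Locations == "1", "I", "C"))

# Unique code for each sampling time, combining period and time-within-period
toy$Times.unique <- interaction(toy$B, toy$Times, sep = "_", drop = TRUE)

nlevels(toy$Times.unique)          # number of distinct sampling times
table(toy$B, toy$Times.unique)     # each unique time occurs in exactly one period
```

The cross-tabulation is a quick way to confirm nesting: every column has observations in only one row.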

While I am clear on how to analyse such a design using mixed models (lmer), I would like to replicate his example exactly so that I can run some simulations to compare against his method.

In particular, I am attempting to replicate the SS values presented in Table 4 under column "a". He fits a design that has the following SS and df values for each term:

B -> SS = 66.13, df = 1

Times(B) -> SS = 280.64, df = 6

Locations -> SS = 2836.86, df = 2

B x Locations -> SS = 29.26, df = 2

Times(B) x Locations -> SS = 575.45, df = 12

Residual -> SS = 2420.00, df = 48

Total -> SS = 6208.34, df = 71

I assume the Times(B) term represents Times nested within the Before/After treatment "B". For this example he ignores that Locations are from control and impact treatments and leaves out factor C altogether.
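The df column of that table can be checked by hand from the design sizes. The following is only a sketch of the standard degrees-of-freedom arithmetic for a nested-by-crossed layout; the variable names are made up for illustration, and the 3 replicates per cell come from 72 rows / (2 x 4 x 3) cells:

```r
# Design sizes read off the data
n_B   <- 2   # periods (before / after)
n_T   <- 4   # times within each period
n_L   <- 3   # locations
n_rep <- 3   # replicates per Times x Locations cell

df <- c(
  "B"                    = n_B - 1,
  "Times(B)"             = n_B * (n_T - 1),              # times nested within periods
  "Locations"            = n_L - 1,
  "B x Locations"        = (n_B - 1) * (n_L - 1),
  "Times(B) x Locations" = n_B * (n_T - 1) * (n_L - 1),
  "Residual"             = n_B * n_T * n_L * (n_rep - 1)
)
df
sum(df)   # should equal the number of observations minus 1
```

Any fit that reports a different df for one of these terms is not the model in the table, which is a useful first check before comparing SS values.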

I have tried every combination I can think of to reproduce this nested ANOVA, using both the unique Times coding and Times coded as 1:4 within B (before and after). I have tried using %in%, / and Error() terms, as well as Anova() from the car package to change the type of SS calculated. Examples of the %in% and / nested fits include:

aov(Y~B+Locations+Times%in%B+B:Locations+Times%in%B:Locations, data=dat)
aov(Y~B+Locations+B/Times+B:Locations+B/Times:Locations, data=dat)
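For anyone unsure what the / formula actually fits, terms() will expand it without needing any data. This is a small sketch; `f_long`, `f_short` and the helper `norm_terms` are names made up here, and the shorter formula is offered as an assumed equivalent rather than something from the paper:

```r
# Expand the "/" formula above into its individual model terms
f_long <- Y ~ B + Locations + B/Times + B:Locations + B/Times:Locations
tt <- terms(f_long)
attr(tt, "term.labels")   # one label per distinct term; duplicates are dropped
attr(tt, "order")         # interaction order of each term

# A shorter formula that should expand to the same set of terms
f_short <- Y ~ B/Times * Locations

# Hypothetical helper: term labels with the variables sorted inside each label,
# so that "B:Times:Locations" and "B:Locations:Times" compare as equal
norm_terms <- function(f) {
  labs <- attr(terms(f), "term.labels")
  sort(vapply(strsplit(labs, ":", fixed = TRUE),
              function(v) paste(sort(v), collapse = ":"),
              character(1)))
}
identical(norm_terms(f_long), norm_terms(f_short))
```

Note that no standalone Times term appears in the expansion: Times only enters through B:Times and the three-way term, which is what nesting within B means here.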

I seem to be unable to replicate Underwood's SS values exactly, particularly for the two interaction terms. A friend let me fit the model in Statistix, where the SS values can be reproduced exactly, so it is possible to obtain the above SS values for this model.

Can anyone help me fit this model in R? I wish to embed it in a larger simulation, so I really need to be able to run the model in R such that the Underwood 1993 SS values are reproduced exactly.

Your problem is that dat$Locations is an integer, when it should be a factor (three unique locations). One hint is that your ANOVA output gives Locations only 1 df, while Underwood gives it 2.

Simply add the line:

dat$Locations = factor(dat$Locations)
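The effect of that conversion on the degrees of freedom can be seen on simulated data. This is a sketch only: `toy` and `model_df` are hypothetical names and the response is random noise, so only the df column is meaningful:

```r
# Toy balanced layout with Locations stored as an integer, as in the question
set.seed(42)
toy <- expand.grid(
  rep       = 1:3,
  Locations = 1:3,                    # integer, not a factor
  Times     = factor(1:4),
  B         = factor(c("B", "A"))
)
toy$Y <- rnorm(nrow(toy), mean = 40, sd = 8)   # simulated response, illustration only

# Hypothetical helper: fit the model from the question and return the df column
model_df <- function(d) {
  fit <- aov(Y ~ B + Locations + B/Times + B:Locations + B/Times:Locations, data = d)
  summary(fit)[[1]]$Df
}

df_integer <- model_df(toy)           # Locations treated as a numeric slope
toy$Locations <- factor(toy$Locations)
df_factor <- model_df(toy)            # Locations treated as three categories

df_integer
df_factor
```

With an integer Locations the model fits a straight-line trend across location numbers, so the Locations term gets 1 df. As a factor it gets 2, and every term involving Locations changes accordingly.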

With Locations converted to a factor, your line of code reproduces the Underwood results exactly:

aov(Y~B+Locations+B/Times+B:Locations+B/Times:Locations, data=dat)
#Call:
#   aov(formula = Y ~ B + Locations + B/Times + B:Locations + B/Times:Locations, 
#    data = dat)
#
#Terms:
#                        B Locations   B:Times B:Locations B:Locations:Times
#Sum of Squares    66.1250 2836.8611  280.6389     29.2500          575.4444
#Deg. of Freedom         1         2         6           2                12
#                Residuals
#Sum of Squares  2420.0000
#Deg. of Freedom        48
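For use inside a simulation it may be handier to pull the sums of squares out as a numeric vector. Here is a self-contained sketch; the name `dat2` and the compact rep() construction are my own shorthand for the question's data frame, and the response values are copied from it in row order:

```r
# Response values copied from the question's data frame, in row order
Y <- c(59, 51, 45, 46, 40, 32, 39, 32, 25,  51, 44, 37, 55, 47, 41, 31, 38, 45,
       41, 47, 55, 43, 36, 29, 23, 30, 37,  57, 50, 43, 36, 44, 51, 39, 29, 23,
       38, 44, 52, 31, 38, 45, 42, 35, 28,  52, 44, 37, 51, 43, 37, 38, 31, 24,
       60, 52, 46, 30, 37, 44, 41, 34, 27,  53, 46, 39, 40, 34, 26, 21, 27, 35)

# Same layout as dat: 36 "before" rows then 36 "after" rows,
# 4 times per period, 3 locations per time, 3 replicates per location
dat2 <- data.frame(
  B         = factor(rep(c("B", "A"), each = 36)),
  Times     = factor(rep(rep(1:4, each = 9), times = 2)),
  Locations = factor(rep(rep(1:3, each = 3), times = 8)),   # a factor, not an integer
  Y         = Y
)

fit <- aov(Y ~ B + Locations + B/Times + B:Locations + B/Times:Locations, data = dat2)
ss  <- summary(fit)[[1]][["Sum Sq"]]   # SS in the order the terms are printed

# Values from the aov printout above, for comparison
printed <- c(66.1250, 2836.8611, 280.6389, 29.2500, 575.4444, 2420.0000)
max(abs(ss - printed))   # should be at rounding-error level
sum(ss)                  # compare with the Total SS reported in the paper
```

Because the design is balanced, the order in which the terms enter the formula should not change these sequential sums of squares.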
