
One-way repeated measures ANOVA with unbalanced data

I'm new to R, and I've been reading these forums (for help with R) for a while now, but this is my first time posting. After googling each of the errors below, I still can't figure out how to fix my mistakes.

I am trying to run a one-way repeated measures ANOVA with unequal sample sizes. Here is a toy version of my data and the code that I'm using. (If it matters, my real data have 12 bins with 14 to 20 values in each bin.)

## the data: average probability for a subject, given reaction time bin
bin1=c(0.37,0.00,0.00,0.16,0.00,0.00,0.08,0.06)
bin2=c(0.33,0.21,0.000,1.00,0.00,0.00,0.00,0.00,0.09,0.10,0.04)
bin3=c(0.07,0.41,0.07,0.00,0.10,0.00,0.30,0.25,0.08,0.15,0.32,0.18)

## creating the data frame

# dependent variable column
probability=c(bin1,bin2,bin3)

# condition column
bin=c(rep("bin1",8),rep("bin2",11),rep("bin3",12))

# subject column (in the order that will match them up with their respective
# values in the dependent variable column)
subject=c("S2","S3","S5","S7","S8","S9","S11","S12","S1","S2","S3","S4","S7",
  "S9","S10","S11","S12","S13","S14","S1","S2","S3","S5","S7","S8","S9","S10",
  "S11","S12","S13","S14")

# putting together the data frame
dataFrame=data.frame(cbind(probability,bin,subject))

## one-way repeated measures anova
test=aov(probability~bin+Error(subject/bin),data=dataFrame)

These are the errors I get:

Error in qr.qty(qr.e, resp) : 
  invalid to change the storage mode of a factor
In addition: Warning messages:
1: In model.response(mf, "numeric") :
  using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : - not meaningful for factors
3: In aov(probability ~ bin + Error(subject/bin), data = dataFrame) :
  Error() model is singular

Sorry for the complexity (assuming it is complex; it is to me). Thank you for your time.

For an unbalanced repeated-measures design, it might be easiest to use lme (from the nlme package):

## this should be the same as the data you constructed above, just 
## a slightly more compact way to do it.
datList <- list(
   bin1=c(0.37,0.00,0.00,0.16,0.00,0.00,0.08,0.06),
   bin2=c(0.33,0.21,0.000,1.00,0.00,0.00,0.00,0.00,0.09,0.10,0.04),
   bin3=c(0.07,0.41,0.07,0.00,0.10,0.00,0.30,0.25,0.08,0.15,0.32,0.18))
subject=c("S2","S3","S5","S7","S8","S9","S11","S12",
          "S1","S2","S3","S4","S7","S9","S10","S11","S12","S13","S14",
          "S1","S2","S3","S5","S7","S8","S9","S10","S11","S12","S13","S14")
d <- data.frame(probability=do.call(c,datList),
                bin=paste0("bin",rep(1:3,sapply(datList,length))),
                subject)

library(nlme)
m1 <- lme(probability~bin,random=~1|subject/bin,data=d)
summary(m1)
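
A side note on the errors in the question, offered as a likely explanation rather than a confirmed diagnosis: cbind() builds a matrix, and a matrix holds a single type, so combining the numeric probability vector with the character bin and subject vectors turns the probabilities into strings. data.frame() then stores that column as a factor (or as character in R 4.0 and later), which would account for the "factor response" messages. The small sketch below uses made-up three-element vectors, not the question's data, to show the difference.

```r
# Toy vectors: hypothetical, much shorter than the data in the question
probability <- c(0.37, 0.00, 0.16)
bin <- c("bin1", "bin1", "bin2")
subject <- c("S2", "S3", "S2")

# cbind() returns a matrix; a matrix has one type, so the numbers
# are coerced to character strings
m <- cbind(probability, bin, subject)
is.character(m)                  # TRUE

# Wrapping the matrix in data.frame() does not restore the numeric type
bad <- data.frame(m)
is.numeric(bad$probability)      # FALSE

# Passing the vectors to data.frame() directly keeps each column's type
good <- data.frame(probability, bin, subject)
is.numeric(good$probability)     # TRUE
```

Building the data frame directly, as in the code above, avoids the coercion. With unbalanced data, aov() may still warn that the Error() model is singular, which is where lme comes in.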

The only real problem with the lme approach is that some aspects of the interpretation are pretty far from the classical sum-of-squares decomposition (e.g. it's fairly tricky to do significance tests of variance components). Pinheiro and Bates (Springer, 2000) is highly recommended reading if you're going to head in this direction.

It might be a good idea to simulate (or make up) some balanced data, run the analysis with both aov() and lme(), and compare the output until you can see where the two correspond and understand what's going on.
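
A minimal sketch of that comparison follows. Everything in it is an assumption made for illustration: the seed, the number of subjects, the effect sizes, and the names sim, fit_aov and fit_lme are all made up. Because each simulated subject contributes exactly one value per bin, a bin-within-subject random intercept cannot be told apart from the residual, so the sketch uses the simpler random = ~ 1 | subject rather than the nested term shown earlier.

```r
library(nlme)

set.seed(1)                       # arbitrary seed, for reproducibility
n_subj <- 10                      # hypothetical number of subjects

# Balanced design: every subject has exactly one value in every bin.
# expand.grid() returns both columns as factors.
sim <- expand.grid(subject = paste0("S", 1:n_subj),
                   bin = c("bin1", "bin2", "bin3"))

# Made-up effects: a per-subject offset, a per-bin shift, plus noise
subj_eff <- rnorm(n_subj, sd = 1)
bin_eff <- c(0, 0.5, 1)
sim$y <- subj_eff[as.integer(sim$subject)] +
  bin_eff[as.integer(sim$bin)] +
  rnorm(nrow(sim), sd = 0.5)

# Classical repeated-measures ANOVA: bin is tested in the
# subject:bin error stratum
fit_aov <- aov(y ~ bin + Error(subject/bin), data = sim)
summary(fit_aov)

# Mixed model with a random intercept per subject (REML by default)
fit_lme <- lme(y ~ bin, random = ~ 1 | subject, data = sim)
anova(fit_lme)
```

In a balanced design like this one, the row for bin in anova(fit_lme) should line up with the bin line in the subject:bin stratum of summary(fit_aov), which makes it a convenient place to start matching up the two outputs.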
