
Repeated measures ANOVA in a longitudinal study

I have a data set like the following:

Groups  Score1  Score2  Score3
G1      12      19      11
G1      8       2       12
G1      5       4       17
G1      20      17      5
G1      15      3       18
G1      5       9       6
G1      14      13      16
G1      2       7       2
G1      14      1       0
G1      9       19      11
G2      8       11      9
G2      14      7       17
G2      16      10      18
G2      13      9       14
G2      10      15      15
G2      5       1       11
G2      4       16      19
G2      17      14      16
G2      14      13      16
G2      2       0       13
G3      16      13      19
G3      3       12      10
G3      9       4       16
G3      17      3       12
G3      18      4       6
G3      20      1       18
G3      15      17      7
G3      10      16      12
G3      3       12      2
G3      8       2       2

My goal is to compare the three scores within each group, e.g. to see whether the mean of Score1 for group 1 is significantly different from the means of Score2 and Score3. I also want to compare the mean of Score1 between the groups, and to plot all three scores (three lines) against the grouping factor on the horizontal axis in a nice graph. I am stuck on which R package I should use. Could somebody please let me know which package and function is best for this? Thanks.

Something like this?

library(reshape2)    # for melt(...)
library(ggplot2)     # mean_cl_normal also needs the Hmisc package installed
# wide -> long: one row per (Groups, Score) observation
df.melt <- melt(df, id.vars="Groups", variable.name="Score")
ggplot(df.melt, aes(x=Groups, y=value, color=Score))+
  # fun= replaces fun.y=, which is deprecated since ggplot2 3.3.0
  stat_summary(geom="point", fun=mean, position=position_dodge(width=0.5))+
  stat_summary(geom="errorbar", fun.data=mean_cl_normal, width=0.1, position=position_dodge(width=0.5))+
  labs(x="", y="Score")

So here, we first convert your dataset from "wide" format (scores in different columns) to "long" format (all the scores in one column, value, with a second column, Score, indicating which set each row belongs to). Then we use ggplot to plot the mean score (the stat_summary(geom="point", ...) layer, which applies mean) and the 95% confidence limits (CL) around it (the stat_summary(geom="errorbar", ...) layer, which applies mean_cl_normal). The rest is just formatting.
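To make the error bars less of a black box, here is a minimal base-R sketch of the numbers the two layers compute for Score2 and Score3 in group G2 (values copied from the question). The helper name ci95 is made up for illustration, and the t-based interval is my understanding of what mean_cl_normal returns, so treat it as an approximation of the plot rather than its exact internals.

```r
# Group G2 values, copied from the table in the question
score2 <- c(11, 7, 10, 9, 15, 1, 16, 14, 13, 0)
score3 <- c(9, 17, 18, 14, 15, 11, 19, 16, 16, 13)

# Hypothetical helper: mean and t-based 95% confidence limits
ci95 <- function(x) {
  n  <- length(x)
  m  <- mean(x)
  hw <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)  # half-width of the interval
  c(mean = m, lower = m - hw, upper = m + hw)
}

cl2 <- ci95(score2)
cl3 <- ci95(score3)
print(rbind(Score2 = cl2, Score3 = cl3))

# TRUE when the two intervals share at least one value
overlap <- cl2[["upper"]] >= cl3[["lower"]] && cl3[["upper"]] >= cl2[["lower"]]
print(overlap)
```

The same helper could be applied to every group/score combination (for example with aggregate on the long data) to get the full table behind the plot.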

Since the 95% CLs overlap for every group and every score, you might conclude from this plot that no score/group differs from any other score/group. But this is misleading. If we run a t-test comparing scores 2 and 3 in group 2, for instance,

with(df[df$Groups=="G2",],t.test(Score2,Score3))
#   Welch Two Sample t-test
# 
# data:  Score2 and Score3
# t = -2.5857, df = 14.184, p-value = 0.0214
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -9.5080934 -0.8919066
# sample estimates:
# mean of x mean of y 
#       9.6      14.8 

we can see that the means of these two scores differ significantly (p ≈ 0.021), i.e. at approximately the 98% confidence level.
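Note that the test above treats Score2 and Score3 as independent samples. If each row of the table is one subject measured three times (which the question's "longitudinal" wording suggests, but which is an assumption on my part), a paired test and a mixed-design ANOVA may be closer to what was asked. A rough base-R sketch follows; the Subject column and the names df.long, paired and fit are introduced here for illustration only, and assumptions such as sphericity are not checked.

```r
# Data as posted in the question; each row is assumed to be one subject
df <- read.table(header = TRUE, text = "
Groups Score1 Score2 Score3
G1 12 19 11
G1 8 2 12
G1 5 4 17
G1 20 17 5
G1 15 3 18
G1 5 9 6
G1 14 13 16
G1 2 7 2
G1 14 1 0
G1 9 19 11
G2 8 11 9
G2 14 7 17
G2 16 10 18
G2 13 9 14
G2 10 15 15
G2 5 1 11
G2 4 16 19
G2 17 14 16
G2 14 13 16
G2 2 0 13
G3 16 13 19
G3 3 12 10
G3 9 4 16
G3 17 3 12
G3 18 4 6
G3 20 1 18
G3 15 17 7
G3 10 16 12
G3 3 12 2
G3 8 2 2
")

# Hypothetical subject identifier: one per row
df$Subject <- factor(seq_len(nrow(df)))

# Paired comparison of Score2 and Score3 within group G2
g2 <- df[df$Groups == "G2", ]
paired <- t.test(g2$Score2, g2$Score3, paired = TRUE)
print(paired)

# Wide -> long in base R (same idea as melt above)
df.long <- data.frame(
  Subject = rep(df$Subject, times = 3),
  Groups  = factor(rep(df$Groups, times = 3)),
  Score   = factor(rep(c("Score1", "Score2", "Score3"), each = nrow(df))),
  value   = c(df$Score1, df$Score2, df$Score3)
)

# Mixed-design ANOVA: Groups between subjects, Score within subjects
fit <- aov(value ~ Groups * Score + Error(Subject / Score), data = df.long)
print(summary(fit))
```

The Error(Subject / Score) term is what tells aov that the three scores are repeated measurements on the same subject. Mixed-model packages such as nlme or lme4 are common alternatives when the design is unbalanced or has missing values.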
