
Means of different groups in an ANCOVA with 2 factors in R

I am currently studying the influence of different traits on the shell volume of a snail. I have a data frame in which each row represents one individual and the columns hold its attributes (length, shell volume, sex, infection).

I ran the ANCOVA with mod = aov(log(volume) ~ infection*sex*log(length)) and got this:

                         Df Sum Sq Mean Sq F value Pr(>F)    
    inf                   1  4.896   4.896 258.126 <2e-16 ***
    sex                   1  3.653   3.653 192.564 <2e-16 ***
    log(length)           1 14.556  14.556 767.335 <2e-16 ***
    inf:sex               1  0.028   0.028   1.472  0.227    
    inf:log(length)       1  0.020   0.020   1.064  0.304    
    sex:log(length)       1  0.001   0.001   0.076  0.783    
    inf:sex:log(length)   1  0.010   0.010   0.522  0.471    
    Residuals           174  3.301   0.019                   

So sex, infection and length all have significant effects, but none of the interaction terms is significant.

Since there are no interactions, I would like to know, for a given sex, whether the intercept of log(volume) = f(log(length)) is larger for infected or for uninfected individuals.
I tried summary.lm(mod), which gave me this:

                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)             -0.42806    0.15429  -2.774  0.00613 ** 
infmic                  -0.54963    0.40895  -1.344  0.18070    
sexM                    -0.11542    0.35508  -0.325  0.74554    
log(length)              2.41915    0.11144  21.709  < 2e-16 ***
infmic:sexM              0.52459    0.63956   0.820  0.41320    
infmic:log(length)       0.43215    0.33717   1.282  0.20166    
sexM:log(length)         0.04207    0.28113   0.150  0.88122    
infmic:sexM:log(length) -0.38222    0.52920  -0.722  0.47110    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1377 on 174 degrees of freedom
Multiple R-squared:  0.8753,    Adjusted R-squared:  0.8703 
F-statistic: 174.5 on 7 and 174 DF,  p-value: < 2.2e-16

But I have trouble interpreting the results and still don't see how to reach a conclusion. I also have a few other questions:

Why aren't sex and infection significant in the lm output? I know the interaction terms are not significant here, but how should I interpret their lines?

My guess is that infmic:sexM represents the change in the slope of log(volume) = f(log(length)) for infected males compared with uninfected females. Would infmic:log(length) then be the change in slope between infected females and uninfected females, and sexM:log(length) the change between uninfected males and uninfected females? Is this correct? And what does the triple interaction term represent?

Thanks a lot!

EDIT: I found part of the answer.

Let's split the data into 4 groups (F-NI, F-I, M-NI, M-I) and look at the equation of the regression line log(volume) = f(log(length)) for each group. Here, the coefficients are the ones given by summary.lm(mod).

The equations are:

  • For non-infected females: log(volume) = (Intercept) + log(length)
  • For infected females: log(volume) = (Intercept) + infmic + log(length) + infmic:log(length)
  • For non-infected males: log(volume) = (Intercept) + sexM + log(length) + sexM:log(length)
  • For infected males: log(volume) = (Intercept) + infmic + sexM + infmic:sexM + log(length) + infmic:log(length) + sexM:log(length) + infmic:sexM:log(length)

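In each equation, the names stand for the coefficient estimates. The slope (the factor that multiplies log(length)) is the sum of the terms from log(length) onwards, and the intercept is the sum of the terms before it.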
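
As a quick check of these four equations, here is a minimal sketch that adds up the estimates for each group. The numbers are copied by hand from the summary.lm(mod) output above; the helper group_line() and the 0/1 indicator arguments are names made up for this illustration (with the fitted model at hand, coef(mod) would give the same vector).

```r
# Coefficient estimates copied from the summary.lm(mod) output in the question.
b <- c(
  "(Intercept)"             = -0.42806,
  "infmic"                  = -0.54963,
  "sexM"                    = -0.11542,
  "log(length)"             =  2.41915,
  "infmic:sexM"             =  0.52459,
  "infmic:log(length)"      =  0.43215,
  "sexM:log(length)"        =  0.04207,
  "infmic:sexM:log(length)" = -0.38222
)

# Hypothetical helper: intercept and slope of
#   log(volume) = intercept + slope * log(length)
# for one group; `infected` and `male` are 0/1 indicators.
group_line <- function(infected, male) {
  intercept <- b[["(Intercept)"]] +
    infected * b[["infmic"]] +
    male * b[["sexM"]] +
    infected * male * b[["infmic:sexM"]]
  slope <- b[["log(length)"]] +
    infected * b[["infmic:log(length)"]] +
    male * b[["sexM:log(length)"]] +
    infected * male * b[["infmic:sexM:log(length)"]]
  c(intercept = intercept, slope = slope)
}

# One row per group, in the same order as the list above.
fits <- rbind(
  "F-NI" = group_line(0, 0),
  "F-I"  = group_line(1, 0),
  "M-NI" = group_line(0, 1),
  "M-I"  = group_line(1, 1)
)
print(round(fits, 5))
```

Read this way, infmic, sexM and infmic:sexM only shift intercepts, while every term containing log(length) shifts a slope.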

It might be obvious for some of you, but I really didn't understand what each coefficient represented at first, so I prefer to put it here!
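
For the original goal (comparing intercepts between infected and uninfected individuals of a given sex), one possible approach, given that none of the interaction terms was significant, would be to refit the model without them, so that all groups share one slope and differ only in intercept. The sketch below only illustrates the idea: the data frame d is simulated, NOT my snail data, and the level name "ni" for uninfected individuals and all the numbers used in the simulation are assumptions made up for the example.

```r
set.seed(1)
n <- 200

# Simulated stand-in data (assumption: NOT the snail data from the question).
# "mic" matches the infmic coefficient above; "ni" is a made-up reference level.
d <- data.frame(
  inf    = factor(sample(c("ni", "mic"), n, replace = TRUE),
                  levels = c("ni", "mic")),
  sex    = factor(sample(c("F", "M"), n, replace = TRUE)),
  length = runif(n, 2, 6)
)

# Made-up true effects: infection lowers the intercept by 0.3 on the log scale.
d$volume <- exp(-0.4 - 0.3 * (d$inf == "mic") - 0.1 * (d$sex == "M") +
                2.4 * log(d$length) + rnorm(n, sd = 0.14))

# Additive model: one common slope, one intercept shift per factor.
mod_add <- lm(log(volume) ~ inf + sex + log(length), data = d)

# The infmic row is the estimated intercept difference (infected minus
# uninfected) at a given sex, with its standard error and t-test.
print(coef(summary(mod_add))["infmic", ])
```

In this additive model the sign of the infmic estimate says which group has the larger intercept, and because there is no inf:sex term the same difference applies to both sexes.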

Alice
