
Compare slopes of regression lines by interaction of covariates

There are a number of questions I've found on the topic but none quite analogous to my scenario. This is primarily a general statistics approach question, but any helpful information on how to approach this type of data in R is much appreciated!

This is a biological study in which I have three independent mutations, A, B, and C, which I combine to create genotypes. My study design has a dependent variable (Distance) and an independent variable (Load), which I measure in the following genotypes:

Genotypes:

Reference
A
B
C
A:B
A:C
B:C
A:B:C

That is, I have the background levels of Distance given Load in my Reference genotype. What I want to test is the contribution of each individual mutation (A, B, or C) and each combination of mutations (AB, AC, BC, ABC) to Distance for a given Load. My plan was to test whether the slopes of the regression lines for the different genotypes differ significantly from each other. This would let me determine whether a loss in Distance is due to an additive increase in Load, or whether certain genotypes lose Distance faster or slower as Load increases.

I am unsure if/how I can use ANCOVA or a mixed effects model for this question.

I have a similar approach where I see how Distance varies by genotype. In that example, my data structure looks like this (csv):

Genotype, Distance, A, B, C
Reference, 15, 0, 0, 0
Reference, 16, 0, 0, 0
A, 15, 1, 0, 0
A, 16, 1, 0, 0
B, 12, 0, 1, 0
B, 11, 0, 1, 0
C, 15, 0, 0, 1
C, 15, 0, 0, 1
AB, 3, 1, 1, 0
AB, 4, 1, 1, 0
AC, 13, 1, 0, 1
AC, 14, 1, 0, 1
BC, 8, 0, 1, 1
BC, 9, 0, 1, 1
ABC, 2, 1, 1, 1
ABC, 2, 1, 1, 1

Here I measure Distance for each genotype (with replicates) and use indicator columns to record which mutations that genotype carries. A has 1, 0, 0 to indicate it has A but not B or C; AB has 1, 1, 0 to indicate it has A and B but not C; and so on.

And then I use:

fit <- lm(Distance ~ A*B*C, data = data)
summary(fit)

to test the contribution of each mutation to Distance and to see whether any interaction terms (e.g. A:B) are significant. This uses A, B, and C as continuous variables (a violation of normality, but it's an approach at least).
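As a concrete illustration of the fit described above, here is a minimal sketch that rebuilds the posted sample data and runs the same full-factorial model. The data frame name `dat` and the object name `fit` are choices made for this sketch, not names from the original study.

```r
# Sample data copied from the table in the question (two replicates per genotype)
genotypes <- c("Reference", "A", "B", "C", "AB", "AC", "BC", "ABC")
dat <- data.frame(
  Genotype = rep(genotypes, each = 2),
  Distance = c(15, 16, 15, 16, 12, 11, 15, 15, 3, 4, 13, 14, 8, 9, 2, 2),
  A = rep(c(0, 1, 0, 0, 1, 1, 0, 1), each = 2),
  B = rep(c(0, 0, 1, 0, 1, 0, 1, 1), each = 2),
  C = rep(c(0, 0, 0, 1, 0, 1, 1, 1), each = 2)
)

# Main effects plus all two-way and three-way interactions
fit <- lm(Distance ~ A * B * C, data = dat)
summary(fit)
```

Because each indicator only takes the values 0 and 1, coding A, B, and C as factors with the default treatment contrasts gives the same estimates; the intercept is the Reference mean and each coefficient is a difference from it.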

Any help, insight, or pointers would be much appreciated. The lm() approach works with one dependent variable and the genotypes as the independent variables, but using the genotypes (including their interactions) as covariates alongside a second independent variable is something I haven't seen in other questions.

With a fixed effects model:

fit.null <- lm(Distance ~ A*B*C + Genotype)

fit.alt <- lm(Distance ~ A*B*C * Genotype)

and test the nested models with anova(fit.null, fit.alt).
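Since the question is about how Distance changes with Load, one plausible reading of this nested comparison is a common Load slope versus a Load slope that varies by genotype. The sketch below is written under that assumption: the `Load` column, the design (5 Load levels, 3 replicates), and all simulated values are made up for illustration and are not from the original study.

```r
set.seed(1)

# Hypothetical design: 8 genotypes x 5 Load levels x 3 replicates
dat <- expand.grid(A = 0:1, B = 0:1, C = 0:1, Load = 1:5, rep = 1:3)

# Simulated response (invented): mutation B steepens the decline with Load
dat$Distance <- 15 - 1 * dat$Load - 1.5 * dat$B * dat$Load +
  rnorm(nrow(dat), sd = 0.5)

# Null: genotypes shift the intercept only, one common Load slope
fit.null <- lm(Distance ~ A * B * C + Load, data = dat)

# Alternative: every genotype combination gets its own Load slope
fit.alt <- lm(Distance ~ A * B * C * Load, data = dat)

# F test of the nested models: do the slopes differ by genotype?
cmp <- anova(fit.null, fit.alt)
print(cmp)
```

If the F test rejects the common-slope model, the individual `A:Load`, `B:Load`, `A:B:Load`, and similar terms in `summary(fit.alt)` indicate which mutations or combinations change the slope.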

This assumes adequate power and homoscedasticity. A simple check: cross-tabulate with freqs <- table(A, B, C, Genotype) and inspect any(freqs < 5). Sparse cells mean low precision, and low precision means low power.
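The cell-count check can be sketched on the sample data from the question. Genotype is left out of the table here because, in the posted data, it is fully determined by A, B, and C; that is a choice made for this sketch, and `dat` is a hypothetical name.

```r
# Indicator columns from the question's sample data (two replicates per genotype)
dat <- data.frame(
  A = rep(c(0, 1, 0, 0, 1, 1, 0, 1), each = 2),
  B = rep(c(0, 0, 1, 0, 1, 0, 1, 1), each = 2),
  C = rep(c(0, 0, 0, 1, 0, 1, 1, 1), each = 2)
)

# Number of observations in each A x B x C cell
freqs <- with(dat, table(A, B, C))
print(freqs)

# TRUE if at least one cell has fewer than 5 observations
sparse <- any(freqs < 5)
print(sparse)
```

With only two replicates per genotype, every cell falls below the threshold, which suggests the sample data on their own would give imprecise interaction estimates.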

A mixed effects analogue using a test of homogeneity of variance with lme4 (maybe user @BenBolker can comment on the appropriateness of this model):

fit.null <- lmer(Distance ~ A*B*C + (1|Genotype))

fit.alt <- lmer(Distance ~ A*B*C + (A*B*C|Genotype))

You can't test nested models here because it's a test of variance components. Testing models like this is more a question for stats.stackexchange.com
