Comparing two curves for difference in trend

I have some data about trends over time in drug use across the state. I want to know whether there have been changes in the gender difference in intravenous drug use versus gender differences in all recreational drug use over time.

My data is below. I think I might need to use time-series analysis, but I'm not sure. Any help would be much appreciated.

The description in the question does not match the data, since the data contains no information on gender. Going by the title, we will assume the goal is to determine whether the trends of illicit and iv are the same.

Comparing Trends

Note that ar selects order 0 for the detrended values of both iv and illicit, i.e. it finds no autocorrelation, so we will use ordinary linear models.

iv <- c(0.4, 0.3, 0.4, 0.3, 0.2, 0.2)
illicit <- c(5.5, 5.7, 4.8, 4.7, 6.1, 5.3)
time <- 2011:2016

ar(resid(lm(iv ~ time)))
## Call:
## ar(x = resid(lm(iv ~ time)))
##
## Order selected 0  sigma^2 estimated as  0.0024

ar(resid(lm(illicit ~ time)))
## Call:
## ar(x = resid(lm(illicit ~ time)))
##
## Order selected 0  sigma^2 estimated as  0.287
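
As a cross-check on the ar result, a Ljung-Box test on the same detrended residuals could be run with Box.test from the stats package. This is only a sketch and not part of the original analysis: lag = 1 is an assumption chosen because each series has just six points, and with so few observations any such test has little power.

```r
iv <- c(0.4, 0.3, 0.4, 0.3, 0.2, 0.2)
illicit <- c(5.5, 5.7, 4.8, 4.7, 6.1, 5.3)
time <- 2011:2016

# Detrend each series with a straight-line fit, then test the residuals
# for lag-1 autocorrelation (lag = 1 is an assumption, not from the answer).
bt_iv <- Box.test(resid(lm(iv ~ time)), lag = 1, type = "Ljung-Box")
bt_illicit <- Box.test(resid(lm(illicit ~ time)), lag = 1, type = "Ljung-Box")

# p-values for the two series
c(iv = bt_iv$p.value, illicit = bt_illicit$p.value)
```

Large p-values here would agree with ar selecting order 0.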

Create a 12x3 data frame long with columns time, values and ind (iv or illicit). Then fit one linear model with two slopes and another with a single common slope; both have two intercepts. Compare them using anova. The difference is not significant (p = 0.7732), so we cannot reject the hypothesis that the slopes are the same.

wide <- data.frame(iv, illicit)
long <- cbind(time, stack(wide))

fm2 <- lm(values ~ ind/(time + 1) + 0, long)
fm1 <- lm(values ~ ind + time + 0, long)
anova(fm1, fm2)

giving:

Analysis of Variance Table

Model 1: values ~ ind + time + 0
Model 2: values ~ ind/(time + 1) + 0
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1      9 1.4629                           
2      8 1.4469  1  0.016071 0.0889 0.7732
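
The same comparison can also be read off a single model with an interaction term, since ind * time spans the same two-intercept, two-slope model as fm2. The sketch below assumes the data defined earlier; the names fm_int, coefs, int_row and p_interaction are introduced here for illustration only. Because the two models differ by one coefficient, the t test on the interaction is equivalent to the F test in the anova table (F = t^2).

```r
iv <- c(0.4, 0.3, 0.4, 0.3, 0.2, 0.2)
illicit <- c(5.5, 5.7, 4.8, 4.7, 6.1, 5.3)
time <- 2011:2016
long <- cbind(time, stack(data.frame(iv, illicit)))

# Same fit as fm2, parameterised as baseline intercept and slope
# plus the differences for the second level of ind
fm_int <- lm(values ~ ind * time, long)
coefs <- coef(summary(fm_int))

# The interaction row holds the difference between the two slopes
int_row <- grep(":", rownames(coefs))
p_interaction <- coefs[int_row, "Pr(>|t|)"]
p_interaction
```

The p-value of the interaction term should match the Pr(>F) value reported by anova(fm1, fm2).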

Comparing model with slopes to one without slopes

In fact the slopes are not significant in the first place. Comparing against a two-intercept model with no slopes, we cannot reject the hypothesis that both slopes are zero (p = 0.9258).

fm0 <- lm(values ~ ind + 0, long)
anova(fm0, fm2)

giving:

Analysis of Variance Table

Model 1: values ~ ind + 0
Model 2: values ~ ind/(time + 1) + 0
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     10 1.4750                           
2      8 1.4469  2  0.028143 0.0778 0.9258

Alternatively, running a stepwise regression, we find that its favored model is the one with two intercepts and no slopes:

step(fm2)

giving:

Start:  AIC=-17.39
values ~ ind/(time + 1) + 0

           Df Sum of Sq    RSS     AIC
- ind:time  2  0.028143 1.4750 -21.155
<none>                  1.4469 -17.386

Step:  AIC=-21.15
values ~ ind - 1

       Df Sum of Sq     RSS     AIC
<none>                1.475 -21.155
- ind   2    172.28 173.750  32.073

Call:
lm(formula = values ~ ind - 1, data = long)

Coefficients:
     indiv  indillicit  
      0.30        5.35  
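
To see where the AIC values in the step output come from, they can be recomputed by hand. For lm fits, step uses n * log(RSS / n) + 2 * (number of coefficients), which is what extractAIC returns. The helper aic_by_hand below is a hypothetical name introduced for this sketch.

```r
iv <- c(0.4, 0.3, 0.4, 0.3, 0.2, 0.2)
illicit <- c(5.5, 5.7, 4.8, 4.7, 6.1, 5.3)
time <- 2011:2016
long <- cbind(time, stack(data.frame(iv, illicit)))

fm2 <- lm(values ~ ind/(time + 1) + 0, long)  # two intercepts, two slopes
fm0 <- lm(values ~ ind + 0, long)             # two intercepts, no slopes

# AIC on the scale used by step(): n * log(RSS / n) + 2 * p
aic_by_hand <- function(fit) {
  n <- nobs(fit)
  rss <- sum(resid(fit)^2)
  n * log(rss / n) + 2 * length(coef(fit))
}

c(fm2 = aic_by_hand(fm2), fm0 = aic_by_hand(fm0))

# extractAIC() returns c(edf, AIC); the second element is the AIC
c(fm2 = extractAIC(fm2)[2], fm0 = extractAIC(fm0)[2])
```

Both calculations should reproduce the values in the step trace, with the no-slope model having the lower AIC.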

Log transformed values

If we use log(values) then we similarly find no autocorrelation (not shown), but this time the slopes of the log transformed values are significantly different at the 5% level (p = 0.024).

fm2log <- lm(log(values) ~ ind/(time + 1) + 0, long)
fm1log <- lm(log(values) ~ ind + time + 0, long)

anova(fm1log, fm2log)

giving:

Analysis of Variance Table

Model 1: log(values) ~ ind + time + 0
Model 2: log(values) ~ ind/(time + 1) + 0
  Res.Df     RSS Df Sum of Sq      F  Pr(>F)  
1      9 0.35898                              
2      8 0.18275  1   0.17622 7.7141 0.02402 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
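
On the log scale a slope is a relative change per year, which is one way to interpret this result. The sketch below assumes the same data as above and converts the two fitted slopes of fm2log into approximate annual percentage changes via exp(slope) - 1; the names slopes and pct_change are introduced here for illustration.

```r
iv <- c(0.4, 0.3, 0.4, 0.3, 0.2, 0.2)
illicit <- c(5.5, 5.7, 4.8, 4.7, 6.1, 5.3)
time <- 2011:2016
long <- cbind(time, stack(data.frame(iv, illicit)))

fm2log <- lm(log(values) ~ ind/(time + 1) + 0, long)

# Pick out the two per-series slope coefficients (the ind:time terms)
slopes <- coef(fm2log)[grep(":time", names(coef(fm2log)))]

# A slope b on the log scale corresponds to a factor exp(b) per year
pct_change <- exp(slopes) - 1
round(100 * pct_change, 1)
```

A difference in log-scale slopes therefore means the two series change at different relative rates, even when their absolute slopes are not distinguishable.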
