I have some data about trends over time in drug use across the state. I want to know whether there have been changes in the gender difference in intravenous drug use versus gender differences in all recreational drug use over time.
My data is below. I think I might need to use time-series analysis, but I'm not sure. Any help would be much appreciated.
Since the description in the question does not match the data as there is no information on gender we will assume from the subject that we want to determine if the trends of illicit and iv are the same.
Note that there is no autocorrelation in the detrended values of iv
or illicit
so we will use ordinary linear models.
iv <- c(0.4, 0.3, 0.4, 0.3, 0.2, 0.2)
illicit <- c(5.5, 5.7, 4.8, 4.7, 6.1, 5.3)
time <- 2011:2016
ar(resid(lm(iv ~ time)))
## Call:
## ar(x = resid(lm(iv ~ time)))
##
## Order selected 0 sigma^2 estimated as 0.0024
ar(resid(lm(illicit ~ time)))
## Call:
## ar(x = resid(lm(illicit ~ time)))
##
## Order selected 0 sigma^2 estimated as 0.287
Create a 12x3 data frame long
with columns time
, value
and ind
( iv
or illicit
). Then run a linear model with two slopes and and another with one slope. Both have two intercepts. Then compare them using anova
. Evidently they are not significantly different so we cannot reject the hypothesis that the slopes are the same.
wide <- data.frame(iv, illicit)
long <- cbind(time, stack(wide))
fm2 <- lm(values ~ ind/(time + 1) + 0, long)
fm1 <- lm(values ~ ind + time + 0, long)
anova(fm1, fm2)
giving:
Analysis of Variance Table
Model 1: values ~ ind + time + 0
Model 2: values ~ ind/(time + 1) + 0
Res.Df RSS Df Sum of Sq F Pr(>F)
1 9 1.4629
2 8 1.4469 1 0.016071 0.0889 0.7732
Actually the slopes are not significant in the first place and we cannot reject the hypothesis that both the slopes are zero. Compare to a two intercept model with no slopes.
fm0 <- lm(values ~ ind + 0, long)
anova(fm0, fm2)
giving:
Analysis of Variance Table
Model 1: values ~ ind + 0
Model 2: values ~ ind/(time + 1) + 0
Res.Df RSS Df Sum of Sq F Pr(>F)
1 10 1.4750
2 8 1.4469 2 0.028143 0.0778 0.9258
or running a stepwise regression we find that its favored model is one with two intercepts and no slopes:
step(fm2)
giving:
Start: AIC=-17.39
values ~ ind/(time + 1) + 0
Df Sum of Sq RSS AIC
- ind:time 2 0.028143 1.4750 -21.155
<none> 1.4469 -17.386
Step: AIC=-21.15
values ~ ind - 1
Df Sum of Sq RSS AIC
<none> 1.475 -21.155
- ind 2 172.28 173.750 32.073
Call:
lm(formula = values ~ ind - 1, data = long)
Coefficients:
indiv indillicit
0.30 5.35
If we use log(values) then we similarly find no autocorrelation (not shown) but we do find the slopes of the log transformed values are significantly different.
fm2log <- lm(log(values) ~ ind/(time + 1) + 0, long)
fm1log <- lm(log(values) ~ ind + time + 0, long)
anova(fm1log, fm2log)
giving:
Analysis of Variance Table
Model 1: log(values) ~ ind + time + 0
Model 2: log(values) ~ ind/(time + 1) + 0
Res.Df RSS Df Sum of Sq F Pr(>F)
1 9 0.35898
2 8 0.18275 1 0.17622 7.7141 0.02402 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.