
Time series analysis in R (figuring out trends)

I have a dataset of tweets from 2013 to 2017. I have coded certain message features (0 for absence, 1 for presence) and want to find out whether there is a trend in my dataset, i.e., whether the occurrence of a message feature gradually goes up or down over time. How should I do this in R?

You could try a linear model, such as the one in one of the answers given here.

# Fix the seed so the example is reproducible
set.seed(200)
library(ggplot2)
# Simple example: assume the data are a binary (Bernoulli) variable that equals 1 with probability 0.7
data <- data.frame(time = 1:200, val = sample(c(0, 1), size = 200, replace = TRUE, prob = c(0.3, 0.7)))

#plot using ggplot and add linear regression and confidence interval
ggplot(data, aes(x = time, y=val)) + geom_smooth(method=lm) +geom_point()

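# Now fit the linear regression directly: val is the response, time is the predictor
fitData <- lm(val ~ time, data = data)
# Predict the next 24 time points with a confidence interval
predict(fitData, newdata = data.frame(time = 201:224), interval = "confidence")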

# You can also let geom_smooth choose a default smoother if you don't specify a method (loess for fewer than 1,000 observations):
ggplot(data, aes(x = time, y=val)) + geom_smooth() +geom_point()

#Here, it seems that loess would be better
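
Since val is a 0/1 indicator, a logistic regression on time is another way to test for a gradual trend, because it keeps predicted values between 0 and 1. The following is only a minimal sketch under the same simulated data as above; the names fit, new_times and pred are made up for illustration, and real tweet data would need a numeric or date-derived time variable in place of 1:200.

```r
# Sketch: test for a trend in a 0/1 feature with logistic regression.
set.seed(200)
# Simulated stand-in for the coded tweets (hypothetical data, same shape as above)
data <- data.frame(
  time = 1:200,
  val  = sample(c(0, 1), size = 200, replace = TRUE, prob = c(0.3, 0.7))
)

# Model the log-odds of the feature being present as a linear function of time
fit <- glm(val ~ time, family = binomial, data = data)

# The "time" row holds the slope on the log-odds scale and its p-value;
# a slope clearly different from zero suggests an upward or downward trend
print(summary(fit)$coefficients)

# Predicted probability of the feature for the next 24 time points
new_times <- data.frame(time = 201:224)
pred <- predict(fit, newdata = new_times, type = "response")
print(head(pred))
```

With simulated data that has a constant probability, the slope should be close to zero; with real data, the sign of the time coefficient indicates the direction of the trend.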

Some code for doing a loess regression in R is here.
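
For reference, a loess fit can also be done directly with the loess() function from base R's stats package. This is only a rough sketch on the simulated data from above; the span value and the names lo and smoothed are assumptions chosen for illustration, not something taken from the linked code.

```r
# Sketch: fit a loess curve to the 0/1 values and inspect the smoothed proportion.
set.seed(200)
# Simulated stand-in for the coded tweets (hypothetical data, same shape as above)
data <- data.frame(
  time = 1:200,
  val  = sample(c(0, 1), size = 200, replace = TRUE, prob = c(0.3, 0.7))
)

# span sets how much of the data each local fit uses; 0.75 is the loess default
lo <- loess(val ~ time, data = data, span = 0.75)

# Fitted values at the observed time points: a smoothed local proportion of 1s
data$smoothed <- predict(lo)

# Plot the raw 0/1 points and overlay the smoothed curve
plot(data$time, data$val, xlab = "time", ylab = "val", pch = 16, col = "grey60")
lines(data$time, data$smoothed, lwd = 2)
```

A smaller span follows local changes more closely, while a larger one gives a smoother curve, so it is worth trying a few values before reading a trend into the result.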

Good luck!
