I am doing some cyclical analysis.
I have a variable X, which is TRUE when the economy is in a state of contraction and FALSE otherwise:
X
##[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
....
which I converted to 0s and 1s with
X2<-as.ts(X*1)
Then I have a date sequence.
td
## [1] "2000-01-31" "2000-02-29" "2000-03-31" "2000-04-30" "2000-05-31" "2000-06-30"
....
I then used 'zoo' to index X2, ordered by td.
library(zoo)
na_ts = zoo(x=X2, order.by=td)
Now my question: I would like to identify the dates when the value changes, and count how long the series has stayed at 1 and at 0.
The desired outcome:
start end type duration
2000-01-31 - 2001-05-31 contraction 17 months
2001-06-30 - 2004-05-31 expansion ....
Would anybody help me please? Many thanks in advance.
You can use the run-length encoding of X to split the time series into runs of consecutive elements with the same value:
# Reproducible example
X <- c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)
td <- c( "2000-01-31", "2000-02-29", "2000-03-31", "2000-04-30", "2000-05-31", "2000-06-30")
library(zoo)
na_ts = zoo(x=X, order.by=td)
# Split with run-length encoding
runlens <- rle(X)
(ts.spl <- split(na_ts, rep(seq_along(runlens$lengths), times=runlens$lengths)))
# $`1`
# 2000-01-31 2000-02-29 2000-03-31
# FALSE FALSE FALSE
#
# $`2`
# 2000-04-30 2000-05-31
# TRUE TRUE
#
# $`3`
# 2000-06-30
# FALSE
Now you can extract whatever information you want from each time series stored in the list ts.spl. For instance:
dat <- data.frame(start = sapply(ts.spl, start),
                  end = sapply(ts.spl, end),
                  val = ifelse(runlens$values, "contraction", "expansion"))
dat$days <- as.numeric(as.Date(dat$end) - as.Date(dat$start), units="days")
dat
# start end val days
# 1 2000-01-31 2000-03-31 expansion 60
# 2 2000-04-30 2000-05-31 contraction 31
# 3 2000-06-30 2000-06-30 expansion 0
This approach is an example of split-apply-combine, where we split our original data based on some property of the data, applied a function to extract information of interest about each piece, and then combined it back together.
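To make the split step more concrete, here is a minimal sketch of what rle() returns for the toy vector from the reproducible example, and how rep() turns the run lengths into one group id per observation. The name grp is only illustrative and does not appear in the answer's code.

```r
# Toy input, same values as the reproducible example
X <- c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)

# rle() collapses consecutive equal values into (length, value) pairs
runlens <- rle(X)
runlens$lengths  # 3 2 1
runlens$values   # FALSE TRUE FALSE

# One group id per observation, repeated for the length of each run: 1 1 1 2 2 3
# ("grp" is an illustrative name, not part of the original answer)
grp <- rep(seq_along(runlens$lengths), times = runlens$lengths)

# inverse.rle() rebuilds the original vector from the encoding
roundtrip <- inverse.rle(runlens)
```

The grp vector is what split() receives as its grouping factor, so each list element holds exactly one run.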
Here is the code after my slight modification. Thanks, josilber! In cyclical analysis we usually work with monthly data, because dating a cycle down to the day wouldn't be accurate. Also, the economy is always in either recession or expansion, so a duration of zero wouldn't make sense.
na_ts = zoo(x=X, order.by=td)
# Split with run-length encoding
runlens <- rle(X)
(ts.spl <- split(na_ts, rep(seq_along(runlens$lengths), times=runlens$lengths)))
dat <- data.frame(start = sapply(ts.spl, start),
                  end = sapply(ts.spl, end),
                  val = ifelse(runlens$values, "contraction", "expansion"))
dat$months <- runlens$lengths
dat
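As a possible variation, the same table can be built in base R without the zoo split, by turning the run lengths into start and end positions with cumsum(). This is only a sketch: run_table is a hypothetical helper name, not something from the posts above. It also assumes regular monthly observations with no gaps, since runlens$lengths counts observations rather than calendar months.

```r
# Hypothetical helper (not from the original posts): summarise the runs of a
# logical vector x against a parallel vector of dates.
run_table <- function(x, dates) {
  stopifnot(is.logical(x), length(x) == length(dates), !anyNA(x))
  r <- rle(x)
  last  <- cumsum(r$lengths)       # position of the last observation in each run
  first <- last - r$lengths + 1    # position of the first observation in each run
  data.frame(
    start  = dates[first],
    end    = dates[last],
    val    = ifelse(r$values, "contraction", "expansion"),
    months = r$lengths,            # observation count; equals months only for gap-free monthly data
    stringsAsFactors = FALSE
  )
}

X  <- c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)
td <- as.Date(c("2000-01-31", "2000-02-29", "2000-03-31",
                "2000-04-30", "2000-05-31", "2000-06-30"))
res <- run_table(X, td)
res
```

Indexing the date vector directly keeps its Date class in the result, whereas sapply() over the split list returns plain character or numeric values.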