Irregular time series in fable package
I think I read somewhere that the tsibble and fable packages can handle irregular time series, but I could not find any examples showing how to do it. Some questions I have are:
Are there any questions or links where I can see a working example? E.g., this question uses zoo/xts to handle it.
I saw some related capabilities in zoo/xts, which is always good, but I am spinning my wheels trying to get this to work in fable.
For a sample dataset we can use:
DF <- structure(list(station = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L), Time = structure(c(1L, 2L, 3L, 5L, 7L, 1L, 2L, 4L, 6L, 8L
), .Label = c("01-01-1974", "01-02-1974", "01-03-1974", "01-04-1974",
"01-05-1974", "01-06-1974", "01-07-1974", "01-08-1974"), class = "factor"),
WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999, 5,
5, 8.6000004, 8.1333332, 12.7999999)), .Names = c("station",
"Time", "WaterTemp"), class = "data.frame", row.names = c(NA,
-10L))
Most models available in {fable} require the observations to be regular, and many models also require that there are no gaps in the data. An example of a model that supports irregular data is fable::TSLM().
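As a rough sketch of that last point, the following fits fable::TSLM() directly to a monthly series that still has gaps, without filling them first. The data mirror the sample dataset; the object names (months, gappy, fit) and the model label lm are illustrative choices, and the sketch assumes the tsibble and fable packages are installed.

```r
library(tsibble)
library(fable)

# Eight consecutive months; each station only observes a subset of them
months <- yearmonth(seq(as.Date("1974-01-01"), by = "1 month", length.out = 8))

# A regular (monthly) tsibble with implicit gaps, built directly for illustration
gappy <- tsibble(
  station   = rep(1:2, each = 5),
  Time      = months[c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)],
  WaterTemp = rep(c(5, 5, 8.6, 8.13, 12.8), times = 2),
  key = station, index = Time
)

# Fit a linear trend per station; the gaps are left as they are
fit <- model(gappy, lm = TSLM(WaterTemp ~ trend()))

fit        # a mable with one model per station
tidy(fit)  # estimated intercept and trend coefficients
```

With only five observations per station this is purely a demonstration that the model can be estimated, not a meaningful fit.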
The example data above would typically be considered 'regular', but with gaps. This is because the data has a common interval of 1 month, but some months are missing from the data. Here is how a tsibble for this data can be produced:
DF <- structure(list(station = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L), Time = structure(c(1L, 2L, 3L, 5L, 7L, 1L, 2L, 4L, 6L, 8L
), .Label = c("01-01-1974", "01-02-1974", "01-03-1974", "01-04-1974",
"01-05-1974", "01-06-1974", "01-07-1974", "01-08-1974"), class = "factor"),
WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999, 5,
5, 8.6000004, 8.1333332, 12.7999999)), .Names = c("station",
"Time", "WaterTemp"), class = "data.frame", row.names = c(NA,
-10L))
# Fix $Time to a valid yearmonth index variable
library(tsibble)
library(dplyr)
DF <- DF %>%
  mutate(Time = yearmonth(as.Date(format(Time), format = "%d-%m-%Y")))
DF
#> station Time WaterTemp
#> 1 1 1974 Jan 5.000000
#> 2 1 1974 Feb 5.000000
#> 3 1 1974 Mar 8.600000
#> 4 1 1974 May 8.133333
#> 5 1 1974 Jul 12.800000
#> 6 2 1974 Jan 5.000000
#> 7 2 1974 Feb 5.000000
#> 8 2 1974 Apr 8.600000
#> 9 2 1974 Jun 8.133333
#> 10 2 1974 Aug 12.800000
# Create a 'regular' tsibble (with gaps)
as_tsibble(DF, key = "station", index = "Time")
#> # A tsibble: 10 x 3 [1M]
#> # Key: station [2]
#> station Time WaterTemp
#> <int> <mth> <dbl>
#> 1 1 1974 Jan 5
#> 2 1 1974 Feb 5
#> 3 1 1974 Mar 8.60
#> 4 1 1974 May 8.13
#> 5 1 1974 Jul 12.8
#> 6 2 1974 Jan 5
#> 7 2 1974 Feb 5
#> 8 2 1974 Apr 8.60
#> 9 2 1974 Jun 8.13
#> 10 2 1974 Aug 12.8
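Before deciding what to do about the gaps, it can help to see where they are. The sketch below rebuilds the same data in a self-contained way (the names months and ts_gappy are illustrative) and uses tsibble's has_gaps(), scan_gaps() and count_gaps() helpers to inspect them.

```r
library(tsibble)

# Eight consecutive months; each station only observes a subset of them
months <- yearmonth(seq(as.Date("1974-01-01"), by = "1 month", length.out = 8))

ts_gappy <- tsibble(
  station   = rep(1:2, each = 5),
  Time      = months[c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)],
  WaterTemp = rep(c(5, 5, 8.6, 8.13, 12.8), times = 2),
  key = station, index = Time
)

has_gaps(ts_gappy)    # one row per station: does the series contain gaps?
scan_gaps(ts_gappy)   # the missing (station, Time) combinations
count_gaps(ts_gappy)  # each run of consecutive missing months and its length
```

By default these helpers look for gaps within each station's own time range, so a station that simply ends earlier than another is not reported as having a gap at the end.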
To fill in the gaps of this dataset, similar to what is shown in the linked question, you can use the tsibble::fill_gaps() function. This makes the data compatible with models such as fable::ARIMA(), which support missing values but do not support gaps in the data.
# Create a 'regular' tsibble (with gaps) then complete the gaps
as_tsibble(DF, key = "station", index = "Time") %>%
  fill_gaps()
#> # A tsibble: 15 x 3 [1M]
#> # Key: station [2]
#> station Time WaterTemp
#> <int> <mth> <dbl>
#> 1 1 1974 Jan 5
#> 2 1 1974 Feb 5
#> 3 1 1974 Mar 8.60
#> 4 1 1974 Apr NA
#> 5 1 1974 May 8.13
#> 6 1 1974 Jun NA
#> 7 1 1974 Jul 12.8
#> 8 2 1974 Jan 5
#> 9 2 1974 Feb 5
#> 10 2 1974 Mar NA
#> 11 2 1974 Apr 8.60
#> 12 2 1974 May NA
#> 13 2 1974 Jun 8.13
#> 14 2 1974 Jul NA
#> 15 2 1974 Aug 12.8
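fill_gaps() also has a couple of options that may be useful here. The following sketch (object names are illustrative, and the mean fill is only an example of supplying a replacement value, not a recommendation) shows .full = TRUE, which extends every station to the full time range of the data, and a name-value pair, which replaces the newly created missing values instead of leaving them as NA.

```r
library(tsibble)

# Eight consecutive months; each station only observes a subset of them
months <- yearmonth(seq(as.Date("1974-01-01"), by = "1 month", length.out = 8))

ts_gappy <- tsibble(
  station   = rep(1:2, each = 5),
  Time      = months[c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)],
  WaterTemp = rep(c(5, 5, 8.6, 8.13, 12.8), times = 2),
  key = station, index = Time
)

# Fill every station over the whole span of the data (Jan to Aug for both)
filled_full <- fill_gaps(ts_gappy, .full = TRUE)

# Fill the gaps with a value rather than NA (here: the mean temperature)
filled_mean <- fill_gaps(ts_gappy, WaterTemp = mean(WaterTemp))

filled_full
filled_mean
```

For models that handle missing values themselves, leaving the filled rows as NA (the default) is usually the safer choice.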
An irregular time series can be created using regular = FALSE. This is typically useful if you're working with event data. In this case you would rarely want to fill the gaps, because there are so many.
# Create an 'irregular' tsibble (no concept of gaps)
as_tsibble(DF, key = "station", index = "Time", regular = FALSE)
#> # A tsibble: 10 x 3 [!]
#> # Key: station [2]
#> station Time WaterTemp
#> <int> <mth> <dbl>
#> 1 1 1974 Jan 5
#> 2 1 1974 Feb 5
#> 3 1 1974 Mar 8.60
#> 4 1 1974 May 8.13
#> 5 1 1974 Jul 12.8
#> 6 2 1974 Jan 5
#> 7 2 1974 Feb 5
#> 8 2 1974 Apr 8.60
#> 9 2 1974 Jun 8.13
#> 10 2 1974 Aug 12.8
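To check which kind of tsibble you are holding, tsibble provides is_regular() and interval(). The sketch below builds both versions of the same data (the names ts_regular and ts_irregular are illustrative) and compares them.

```r
library(tsibble)

# Eight consecutive months; each station only observes a subset of them
months <- yearmonth(seq(as.Date("1974-01-01"), by = "1 month", length.out = 8))

# Regular tsibble: a common 1-month interval is detected, with implicit gaps
ts_regular <- tsibble(
  station   = rep(1:2, each = 5),
  Time      = months[c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)],
  WaterTemp = rep(c(5, 5, 8.6, 8.13, 12.8), times = 2),
  key = station, index = Time
)

# Same rows, declared irregular: no interval, so no concept of gaps
ts_irregular <- as_tsibble(
  tibble::as_tibble(ts_regular),
  key = station, index = Time, regular = FALSE
)

is_regular(ts_regular)    # TRUE
is_regular(ts_irregular)  # FALSE
interval(ts_regular)      # the detected interval (1 month)
```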
Created on 2021-02-09 by the reprex package (v0.3.0)