Irregular time series in fable package

I think I read somewhere that the tsibble and fable packages can handle irregular time series, but I could not find any examples showing how to do it. Some questions I have are:

  1. Do I have to convert an irregular time series to a regular one before I can model it? (As far as I know, this conversion is required. Please let me know if that is not the case, and if so, which models do not need a regular time series.)
  2. What tools and models in tidyverts/tsibble/fable/fabletools handle irregular time series?

Are there any questions or links where I can see a working example? For example, this question uses zoo/xts to handle it.

I saw some related capabilities in zoo/xts, which is always good, but I am spinning my wheels trying to get this to work in fable.

For a sample dataset we can use:

    DF <- structure(list(station = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
    2L), Time = structure(c(1L, 2L, 3L, 5L, 7L, 1L, 2L, 4L, 6L, 8L
    ), .Label = c("01-01-1974", "01-02-1974", "01-03-1974", "01-04-1974", 
    "01-05-1974", "01-06-1974", "01-07-1974", "01-08-1974"), class = "factor"), 
        WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999, 5, 
        5, 8.6000004, 8.1333332, 12.7999999)), .Names = c("station", 
    "Time", "WaterTemp"), class = "data.frame", row.names = c(NA, 
    -10L))

Most models available in {fable} require the observations to be regular, and many also require that there are no gaps in the data. An example of a model that supports irregular data is fable::TSLM().

The example data above would typically be considered 'regular', but with gaps: the data has a common interval of 1 month, but some months are missing. Here is how a tsibble for this data can be produced:

DF <- structure(list(station = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),
  Time = structure(c(1L, 2L, 3L, 5L, 7L, 1L, 2L, 4L, 6L, 8L),
    .Label = c("01-01-1974", "01-02-1974", "01-03-1974", "01-04-1974",
               "01-05-1974", "01-06-1974", "01-07-1974", "01-08-1974"),
    class = "factor"),
  WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999,
                5, 5, 8.6000004, 8.1333332, 12.7999999)),
  .Names = c("station", "Time", "WaterTemp"), class = "data.frame",
  row.names = c(NA, -10L))

# Fix $Time to a valid yearmonth index variable
library(tsibble)
library(dplyr)
DF <- DF %>% 
  mutate(Time = yearmonth(as.Date(format(Time), format = "%d-%m-%Y")))
DF
#>    station     Time WaterTemp
#> 1        1 1974 Jan  5.000000
#> 2        1 1974 Feb  5.000000
#> 3        1 1974 Mar  8.600000
#> 4        1 1974 May  8.133333
#> 5        1 1974 Jul 12.800000
#> 6        2 1974 Jan  5.000000
#> 7        2 1974 Feb  5.000000
#> 8        2 1974 Apr  8.600000
#> 9        2 1974 Jun  8.133333
#> 10       2 1974 Aug 12.800000

# Create a 'regular' tsibble (with gaps)
as_tsibble(DF, key = "station", index = "Time")
#> # A tsibble: 10 x 3 [1M]
#> # Key:       station [2]
#>    station     Time WaterTemp
#>      <int>    <mth>     <dbl>
#>  1       1 1974 Jan      5   
#>  2       1 1974 Feb      5   
#>  3       1 1974 Mar      8.60
#>  4       1 1974 May      8.13
#>  5       1 1974 Jul     12.8 
#>  6       2 1974 Jan      5   
#>  7       2 1974 Feb      5   
#>  8       2 1974 Apr      8.60
#>  9       2 1974 Jun      8.13
#> 10       2 1974 Aug     12.8
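
Before deciding how to treat the gaps, it can help to see where they are. The following is a minimal sketch, not part of the original reprex, using tsibble's gap helpers has_gaps(), scan_gaps() and count_gaps(). The name ts_gaps and the compact way the data is rebuilt are assumptions made only to keep the example self-contained.

```r
library(tsibble)

# Same observations as the example data, rebuilt compactly (illustrative only)
months <- c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)
ts_gaps <- tsibble(
  station = rep(1:2, each = 5),
  Time = yearmonth(as.Date(sprintf("1974-%02d-01", months))),
  WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999,
                5, 5, 8.6000004, 8.1333332, 12.7999999),
  key = station, index = Time
)

has_gaps(ts_gaps)    # one row per station, with a logical .gaps column
scan_gaps(ts_gaps)   # one row per missing station/month combination
count_gaps(ts_gaps)  # runs of consecutive missing months and their length (.n)
```

For this data, both stations report gaps, and scan_gaps() lists the five missing months.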

To fill in the gaps of this dataset, similar to what is shown in the linked question, you can use the tsibble::fill_gaps() function. This makes the data compatible with models such as fable::ARIMA(), which support missing values but do not support gaps in the data.

# Create a 'regular' tsibble (with gaps) then complete the gaps
as_tsibble(DF, key = "station", index = "Time") %>% 
  fill_gaps()
#> # A tsibble: 15 x 3 [1M]
#> # Key:       station [2]
#>    station     Time WaterTemp
#>      <int>    <mth>     <dbl>
#>  1       1 1974 Jan      5   
#>  2       1 1974 Feb      5   
#>  3       1 1974 Mar      8.60
#>  4       1 1974 Apr     NA   
#>  5       1 1974 May      8.13
#>  6       1 1974 Jun     NA   
#>  7       1 1974 Jul     12.8 
#>  8       2 1974 Jan      5   
#>  9       2 1974 Feb      5   
#> 10       2 1974 Mar     NA   
#> 11       2 1974 Apr      8.60
#> 12       2 1974 May     NA   
#> 13       2 1974 Jun      8.13
#> 14       2 1974 Jul     NA   
#> 15       2 1974 Aug     12.8
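
fill_gaps() has a couple of options that may be useful here. The sketch below is an illustration rather than a recommendation: .full = TRUE pads every station to the overall time span of the data, and a name-value pair replaces the inserted rows with a computed value instead of NA. Filling with the station mean is only a placeholder for whatever imputation suits your data, and the object names (ts_gaps, padded, mean_filled) are assumptions.

```r
library(tsibble)
library(dplyr)

# Same observations as the example data, rebuilt compactly (illustrative only)
months <- c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)
ts_gaps <- tsibble(
  station = rep(1:2, each = 5),
  Time = yearmonth(as.Date(sprintf("1974-%02d-01", months))),
  WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999,
                5, 5, 8.6000004, 8.1333332, 12.7999999),
  key = station, index = Time
)

# Pad both stations to the full Jan-Aug span (8 rows each, NA where missing)
padded <- ts_gaps %>%
  fill_gaps(.full = TRUE)

# Insert the missing months and fill them with each station's mean temperature
mean_filled <- ts_gaps %>%
  group_by_key() %>%
  fill_gaps(WaterTemp = mean(WaterTemp))
```

With the default .full = FALSE, each station is only completed within its own observed period, which is why the output above has 15 rows rather than 16.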

An irregular time series can be created using regular = FALSE. This is typically useful if you're working with event data. In this case you would rarely want to fill the gaps, because there are so many.

# Create an 'irregular' tsibble (no concept of gaps)
as_tsibble(DF, key = "station", index = "Time", regular = FALSE)
#> # A tsibble: 10 x 3 [!]
#> # Key:       station [2]
#>    station     Time WaterTemp
#>      <int>    <mth>     <dbl>
#>  1       1 1974 Jan      5   
#>  2       1 1974 Feb      5   
#>  3       1 1974 Mar      8.60
#>  4       1 1974 May      8.13
#>  5       1 1974 Jul     12.8 
#>  6       2 1974 Jan      5   
#>  7       2 1974 Feb      5   
#>  8       2 1974 Apr      8.60
#>  9       2 1974 Jun      8.13
#> 10       2 1974 Aug     12.8
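
To check which kind of tsibble you have, or to switch between the two, something like the following sketch should work. It assumes tsibble's is_regular() and update_tsibble(); the object names ts_irreg and ts_reg are made up for illustration.

```r
library(tsibble)

# Same observations as the example data, this time declared irregular
months <- c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)
ts_irreg <- tsibble(
  station = rep(1:2, each = 5),
  Time = yearmonth(as.Date(sprintf("1974-%02d-01", months))),
  WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999,
                5, 5, 8.6000004, 8.1333332, 12.7999999),
  key = station, index = Time, regular = FALSE
)

is_regular(ts_irreg)  # FALSE: no interval, so no concept of gaps

# Re-declare the same data as regular; the interval is then inferred again
ts_reg <- update_tsibble(ts_irreg, regular = TRUE)
is_regular(ts_reg)    # TRUE
```

Once the data is regular again, the gap tools such as fill_gaps() apply as shown earlier.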

Created on 2021-02-09 by the reprex package (v0.3.0)
