
Time series and stl in R: Error only univariate series are allowed

I am analyzing hourly precipitation data from a disorganized file. I managed to clean it up and store it in a data frame (called CA1), which has the following form:

  Station_ID Guage_Type   Lat   Long       Date Time_Zone Time_Frame H0 H1 H2 H3 H4 H5        H6        H7        H8        H9       H10       H11 H12 H13 H14 H15 H16 H17 H18 H19 H20 H21 H22 H23
1    4457700         HI 41.52 124.03 1948-07-01         8        LST  0  0  0  0  0  0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000   0   0   0   0  0  0   0   0   0   0   0   0
2    4457700         HI 41.52 124.03 1948-07-05         8        LST  0  1  1  1  1  1  2.0000000 2.0000000 2.0000000 4.0000000 5.0000000 5.0000000   4   7   1   1   0 0  10  13   5   1   1   3
3    4457700         HI 41.52 124.03 1948-07-06         8        LST  1  1  1  0  1  1 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000   0   0   0   0   0  0   0   0   0   0   0   0
4    4457700         HI 41.52 124.03 1948-07-27         8        LST  3  0  0  0  0  0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000   0   0   0   0   0 0   0   0   0   0   0   0
5    4457700         HI 41.52 124.03 1948-08-01         8        LST  0  0  0  0  0  0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000   0   0   0   0   0 0   0   0   0   0   0   0
6    4457700         HI 41.52 124.03 1948-08-17         8        LST  0  0  0  0  0  0 0.3888889 0.3888889 0.3888889 0.3888889 0.3888889 0.3888889   6   1   0   0   0 0   0   0   0   0   0   0

where H0 through H23 represent the 24 hours of each day (row).

Using only CA1 (the data frame above), I take the 24 values of each day (row), transpose them into a column, and concatenate all days (rows) into one variable, which I call dat1:
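The reshaping step described above can be sketched roughly as follows. This is only an illustration on a toy data frame: `CA1_demo`, `hour_cols` and `hourly` are hypothetical names, and only three hourly columns are used instead of 24. Note that the result here is a plain numeric vector with no `dim` attribute.

```r
# Toy stand-in for CA1 with three hourly columns (hypothetical data).
CA1_demo <- data.frame(
  Date = as.Date(c("1948-07-01", "1948-07-05")),
  H0 = c(0, 0), H1 = c(0, 1), H2 = c(0, 1)
)

# Pick the hourly columns by name.
hour_cols <- grep("^H[0-9]+$", names(CA1_demo), value = TRUE)

# t() puts hours in rows and days in columns; as.vector() then reads the
# matrix column by column, i.e. day after day, hour after hour.
hourly <- as.vector(t(as.matrix(CA1_demo[, hour_cols])))

hourly
dim(hourly)
```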

 > dat1[1:48,]
  H0  H1  H2  H3  H4  H5  H6  H7  H8  H9 H10 H11 H12 H13 H14 H15 H16 H17 H18 H19 H20 H21 H22 H23  H0  H1  H2  H3  H4  H5  H6  H7  H8  H9 H10 H11 H12 H13 H14 H15 H16 H17 H18 H19 H20 H21 H22 H23 
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1   1   1   1   2   2   2   4   5   5   4   7   1   1   0  0  10  13   5   1   1   3 

I pass dat1 to ts() to get a time series:

> rainCA1 <- ts(dat1, start = c(1900+as.POSIXlt(CA1[1,5])$year, 1+as.POSIXlt(CA1[1,5])$mon), 
    frequency = 24)

A few things to note:

>dim(CA1)
  [1] 5636   31
>length(dat1)
  [1] 135264

Thus 5636 rows * 24 data points per row = 135264 total points, and length(rainCA1) agrees with that. However, if I add an end argument to the ts call, such as

>rainCA1 <- ts(dat1, start = c(1900+as.POSIXlt(CA1[1,5])$year, 1+as.POSIXlt(CA1[1,5])$mon), 
    end = c(1900+as.POSIXlt(CA1[5636,5])$year, 1+as.POSIXlt(CA1[5636,5])$mon),
    frequency = 24)

I get a series of only 1134 points, so a lot of data is missing. I assume this is because the dates are not consecutive and because I only supply the year and month as arguments for the start point.

Continuing along what I think is the correct path, I take the first ts result (the one without the end argument) and supply it as the input to stl:

>rainCA1_2 <-stl(rainCA1, "periodic")

Unfortunately, I get an error:

Error in stl(rainCA1, "periodic") : only univariate series are allowed

I don't understand this error or how to resolve it. However, if I go back to the ts function and provide the end argument, stl works without any errors.

I have searched many forums, but no one (to my understanding) provides a good solution for obtaining the data attributes of hourly data. If anyone could help me, I would highly appreciate it. Thank you!

That error is a result of the shape of your data. Try dim(rainCA1); I suspect it gives something like [1] 135264 1. Replace rainCA1 <- ts(dat1 ... with rainCA1 <- ts(dat1[[1]] ..., and it should work.
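To make the shape issue concrete, here is a minimal sketch on synthetic data. `dat1_demo` is a hypothetical stand-in for dat1, assumed here to be a one-column matrix; if dat1 is a data frame instead, `dat1[[1]]` extracts the column, whereas for a matrix `dat1[, 1]` or `as.numeric(dat1)` would be the equivalent.

```r
# Synthetic stand-in for dat1: 10 days of hourly values as a one-column
# matrix (assumption, not the original data).
set.seed(1)
dat1_demo <- matrix(rgamma(24 * 10, shape = 0.5), ncol = 1)

# ts() keeps the matrix shape, so the series has a dim attribute ...
rain_mat <- ts(dat1_demo, frequency = 24)
dim(rain_mat)

# ... and stl() rejects it with "only univariate series are allowed".
res <- try(stl(rain_mat, "periodic"), silent = TRUE)
inherits(res, "try-error")

# Passing a plain numeric vector instead gives a series without dim,
# which stl() accepts.
rain_vec <- ts(as.numeric(dat1_demo), frequency = 24)
dim(rain_vec)
fit <- stl(rain_vec, "periodic")
```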

Whether it then works correctly, I wonder... It seems to me your first order of business is to get your data into a consistent format. Make sure ts() gets the right input. Check out the precise specification of ts.
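One possible way to get a consistent format is to expand the table to every calendar day before flattening it. The sketch below uses a toy data frame with hypothetical names (`CA1_demo`, `all_days`, `full`) and only two hourly columns. It also assumes that a day absent from the file means no recorded rain, which may not hold for the real data; if it does not, leave the gaps as NA and handle them explicitly.

```r
# Toy stand-in for CA1 with a missing day (1948-07-02) and two hourly columns.
CA1_demo <- data.frame(
  Date = as.Date(c("1948-07-01", "1948-07-03")),
  H0 = c(0, 2), H1 = c(1, 0)
)
hour_cols <- c("H0", "H1")

# Every calendar day between the first and last observation.
all_days <- data.frame(
  Date = seq(min(CA1_demo$Date), max(CA1_demo$Date), by = "day")
)

# Left join: days without a record get NA in the hourly columns.
full <- merge(all_days, CA1_demo, by = "Date", all.x = TRUE)

# ASSUMPTION: a missing day means zero precipitation.
full[hour_cols][is.na(full[hour_cols])] <- 0

full
```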

ts() does not interpret date-time formats. ts() requires consecutive data points at a fixed interval. It uses a major counter and a minor counter (frequency minor steps fit into one major step). For instance, if your data is hourly and you expect seasonality at the daily level, frequency equals 24. start and end are, therefore, primarily cosmetic: start merely indicates t(0) for the major counter, whereas end signifies t(end).
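A small sketch of the two counters, using made-up values rather than the rainfall data. With frequency = 24, start = c(1948, 7) is read as "major unit 1948, 7th of 24 minor steps", not as July 1948. The sketch also shows that supplying both start and end fixes the number of observations, which would be consistent with the shortened series reported in the question.

```r
# 48 consecutive observations, 24 per major unit.
x <- ts(1:48, start = c(1948, 7), frequency = 24)

start(x)          # major and minor counter of the first observation
end(x)            # counters of the last observation, derived from the length
head(time(x), 3)  # time index advances in steps of 1/24

# With both start and end given, ts() keeps only as many values as fit
# between them: one major unit plus one step here, i.e. 25 values.
x_cut <- ts(1:48, start = c(1948, 7), end = c(1949, 7), frequency = 24)
length(x_cut)
```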

In another question, linked here, I tried to explain the right way with a very simple example, to avoid this kind of error:

stl() decomposition won't accept univariate ts object?

If you apply dim() to co2 or AirPassengers, it returns NULL. Thus, I suggest you apply dim(rainCA1) <- NULL.

It has worked for me many times.

One solution I found was time_series_var <- ts(data[, c("var_of_interest")]) followed by time_series_var <- ts(as.vector(time_series_var)). The univariate-related error then disappeared, because the dimensions were now correct.
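The two workarounds can be compared on synthetic data, as sketched below (all object names are illustrative). One caveat worth checking in your own session: as.vector() drops every attribute, including the time-series ones, so the frequency has to be passed to ts() again. Otherwise the series ends up with frequency 1 and has no seasonal period for stl() to decompose.

```r
# Synthetic one-column series with a dim attribute (5 days, hourly).
set.seed(42)
x <- ts(matrix(runif(24 * 5), ncol = 1), frequency = 24)

# Route 1: drop the dim attribute in place; the tsp attribute, and with it
# the frequency, is kept.
x1 <- x
dim(x1) <- NULL
frequency(x1)

# Route 2: as.vector() strips all attributes, so a bare ts() call resets
# the frequency to 1 ...
x2_lost <- ts(as.vector(x))
frequency(x2_lost)

# ... unless start and frequency are supplied again.
x2 <- ts(as.vector(x), start = start(x), frequency = frequency(x))

fit1 <- stl(x1, "periodic")
fit2 <- stl(x2, "periodic")
```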
