Is there a way to make a time series out of an unevenly intervaled data frame in R?

I have a data set of paired (position, time) values, which I have converted into a data frame like this:

(50.0, 0.0), (49, 27.891), (48, 28.119), 
(47, 28.146), (46, 28.158), (45, 28.195), 
(44, 28.261), (43, 28.274), (42, 28.316), 
(41, 28.326), (40, 28.608), (39, 28.687), 
(38, 28.736), (37, 28.746)

numeric_data
   clean_time_numeric clean_position_numeric
1               0.000                     50
2              27.891                     49
3              28.119                     48
4              28.146                     47
5              28.158                     46

This data frame has time points and the position of a slider at each time point. I want to make a time series with intervals of 0.001, with the corresponding position of the slider in the next column, so the position would stay at 50 for the first 27,891 rows (times 0.000 through 27.890) and change to 49 at time 27.891.

I have tried this piece of code, using the xts and zoo packages, that I saw in another post:

df1.zoo <- zoo(clean_time_numeric)
df2 <- as.data.frame(as.zoo(merge(as.xts(df1.zoo), as.xts(zoo(,seq(start(df1.zoo[1]),end(df1.zoo[89]), order.by = as.POSIXct.numeric(clean_time_numeric, tryformats = "%Y%m%d%H%M%S")))))))

but this error keeps showing up:

Error in xts(coredata(x), order.by = order.by, frequency = frequency,  : 
  order.by requires an appropriate time-based object

I am new to coding in R, so I'm not really sure how to approach this or whether there's an easier way to solve it. Any suggestions are welcome!

Thank you,

Edit: I also tried this:

numeric_data$clean_time_numeric<- as.POSIXct.numeric(numeric_data$clean_time_numeric, tz= "GMT", origin = "1970-01-01", tryformats = "%H:%M:%S")

tseries <- data.frame(x = seq(head(numeric_data$clean_time_numeric,1),tail(numeric_data$clean_time_numeric,1),by = "sec"))

res <-merge(tseries, numeric_data, by.x="x",by.y="clean_time_numeric",all.x = TRUE)

xts(res$clean_position_numeric,order.by = res$x)

With this, only the first data point is correct; the rest are NA, and the series stops well before the end.

A possible solution:

  1. Create a sequence with an interval of 0.001.
  2. Join this sequence to the original data frame.
  3. Use zoo::na.locf to replace each NA with the last known value.

library(dplyr)

df <- read.table(text = "
          clean_time_numeric clean_position_numeric
               0.000                     50
              27.891                     49
              28.119                     48
              28.146                     47
              28.158                     46", header = TRUE)

# Regular grid from the first to the last time point, in steps of 0.001
time.001 <- data.frame(time = seq(min(df$clean_time_numeric), max(df$clean_time_numeric), by = 0.001))

# Join the grid to the observations, sort by time, then carry the last
# known position forward into the NA rows created by the join
df.001 <- dplyr::full_join(df, time.001, by = c("clean_time_numeric" = "time")) %>%
       arrange(clean_time_numeric) %>%
       mutate(clean_position_numeric = zoo::na.locf(clean_position_numeric))

head(df.001)
  clean_time_numeric clean_position_numeric
1              0.000                     50
2              0.001                     50
3              0.002                     50
4              0.003                     50
5              0.004                     50
6              0.005                     50

tail(df.001)
      clean_time_numeric clean_position_numeric
28155             28.153                     47
28156             28.154                     47
28157             28.155                     47
28158             28.156                     47
28159             28.157                     47
28160             28.158                     46
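
One thing to watch with the join: its keys are floating-point numbers, and a value produced by seq() is not guaranteed to be bit-identical to the same time read from text. The 28160 rows in the tail output, one more than the 28159 grid points, suggest that at least one observed time did not match its grid point exactly. A minimal base R sketch that sidesteps this, assuming the times are sorted in increasing order and recorded to three decimals, is to work in whole milliseconds and look up the last observation at or before each grid point with findInterval. The names obs_ms, grid_ms, idx and filled are made up for this illustration.

```r
numeric_data <- data.frame(
  clean_time_numeric     = c(0, 27.891, 28.119, 28.146, 28.158),
  clean_position_numeric = 50:46
)

# Assumption: times have millisecond resolution, so they can be
# represented exactly as integer millisecond counts
obs_ms  <- as.integer(round(numeric_data$clean_time_numeric * 1000))
grid_ms <- seq(min(obs_ms), max(obs_ms), by = 1L)

# For each grid point, index of the last observation at or before it
# (requires obs_ms to be sorted in increasing order)
idx <- findInterval(grid_ms, obs_ms)

filled <- data.frame(
  time     = grid_ms / 1000,
  position = numeric_data$clean_position_numeric[idx]
)

nrow(filled)
```

Because the comparison is done on integers, every observed time lands on exactly one grid row and no duplicate rows appear.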

Using the numeric_data data frame shown reproducibly in the Note at the end, convert it to a zoo series using read.zoo. Then set its frequency to 1000 (the number of points per unit interval), convert it to ts class, and use na.locf0 (or na.approx for linear interpolation, or na.spline for spline interpolation) to fill in the NAs created by the conversion from zoo to ts.

library(zoo)

z <- read.zoo(numeric_data)
frequency(z) <- 1000
tt <- na.locf0(as.ts(z))

length(tt)
## [1] 28159
deltat(tt)
## [1] 0.001
range(time(tt))
## [1]  0.000 28.158

We can now

  1. leave it as a ts object, tt, or
  2. convert it to a zoo series: as.zoo(tt), or
  3. convert it to a data frame: fortify.zoo(tt)
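
The three options can be sketched together as follows. This only repeats the calls named in this answer on the input from the Note; the variable names z_filled and df_filled are made up for the illustration, and the zoo package is assumed to be installed.

```r
library(zoo)

numeric_data <- data.frame(
  clean_time_numeric     = c(0, 27.891, 28.119, 28.146, 28.158),
  clean_position_numeric = 50:46
)

# Build the regular ts series as in the answer
z <- read.zoo(numeric_data)
frequency(z) <- 1000
tt <- na.locf0(as.ts(z))   # option 1: keep the ts object

z_filled  <- as.zoo(tt)        # option 2: zoo series
df_filled <- fortify.zoo(tt)   # option 3: data frame (index column plus values)

str(df_filled)
```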

Note

The input in reproducible form:

numeric_data <- 
structure(list(clean_time_numeric = c(0, 27.891, 28.119, 28.146, 
28.158), clean_position_numeric = 50:46), class = "data.frame", row.names = c(NA, -5L))
