
Is there a way to make a time series out of an unevenly intervaled data frame in R?

I have a data set with paired values which I have converted into a data frame like this:

(50.0, 0.0), (49, 27.891), (48, 28.119), 
(47, 28.146), (46, 28.158), (45, 28.195), 
(44, 28.261), (43, 28.274), (42, 28.316), 
(41, 28.326), (40, 28.608), (39, 28.687), 
(38, 28.736), (37, 28.746)

numeric_data
   clean_time_numeric clean_position_numeric
1               0.000                     50
2              27.891                     49
3              28.119                     48
4              28.146                     47
5              28.158                     46

This data frame holds time points and the position of a slider at each time point. I want to build a time series with a step of 0.001 and the corresponding slider position in the next column, so the position would stay at 50 through row 27,891 and change to 49 at time 27.891.

I have tried this piece of code with the xts and zoo packages that I saw from another post:

df1.zoo <- zoo(clean_time_numeric)
df2 <- as.data.frame(as.zoo(merge(as.xts(df1.zoo), as.xts(zoo(,seq(start(df1.zoo[1]),end(df1.zoo[89]), order.by = as.POSIXct.numeric(clean_time_numeric, tryformats = "%Y%m%d%H%M%S")))))))

but this error keeps showing up:

Error in xts(coredata(x), order.by = order.by, frequency = frequency,  : 
  order.by requires an appropriate time-based object

I am new to coding in R, so I'm not sure how to approach this or whether there is an easier way to solve it. Any suggestions are welcome!

Thank you,

Edit: I also tried this:

numeric_data$clean_time_numeric<- as.POSIXct.numeric(numeric_data$clean_time_numeric, tz= "GMT", origin = "1970-01-01", tryformats = "%H:%M:%S")

tseries <- data.frame(x = seq(head(numeric_data$clean_time_numeric,1),tail(numeric_data$clean_time_numeric,1),by = "sec"))

res <-merge(tseries, numeric_data, by.x="x",by.y="clean_time_numeric",all.x = TRUE)

xts(res$clean_position_numeric,order.by = res$x)

With this, only the first data point is correct. The rest are NA, and the series stops well before the end.
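
A likely cause of this symptom can be reproduced on the first five rows with base R only. This is a sketch, not a diagnosis of the full data set: it assumes the times are seconds since the origin, and the names t_obs and grid are illustrative. With by = "sec" the grid advances in whole seconds, so it ends at 28 rather than 28.158, and only the observation at exactly 0 falls on a grid point.

```r
# Observed times from the question, converted the same way as in the edit
t_obs <- as.POSIXct(c(0, 27.891, 28.119, 28.146, 28.158),
                    origin = "1970-01-01", tz = "GMT")

# by = "sec" steps in whole seconds: 0, 1, ..., 28
grid <- seq(t_obs[1], t_obs[length(t_obs)], by = "sec")

length(grid)                                  # number of grid points
max(as.numeric(grid))                         # last grid point, in seconds
sum(as.numeric(grid) %in% as.numeric(t_obs))  # grid points that coincide with an observation
```

Every other observation has a fractional part, so an exact-match merge against this grid leaves its position as NA.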

A possible solution:

  1. Create a sequence with a 0.001 interval.
  2. Join this sequence to the original data frame.
  3. Use zoo::na.locf to replace each NA with the last known value.

df <- read.table(text = "
          clean_time_numeric clean_position_numeric
               0.000                     50
              27.891                     49
              28.119                     48
              28.146                     47
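              28.158                     46", header = TRUE)

# Regular grid from the first to the last observed time, in steps of 0.001
time.001 <- data.frame(time = seq(min(df$clean_time_numeric), max(df$clean_time_numeric), by = 0.001))

library(dplyr)
# Join the grid to the observations, sort by time, then carry the last
# known position forward over the NA rows introduced by the join
df.001 <- dplyr::full_join(df, time.001, by = c("clean_time_numeric" = "time")) %>% 
       arrange(clean_time_numeric) %>%
       mutate(clean_position_numeric = zoo::na.locf(clean_position_numeric))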

head(df.001)
  clean_time_numeric clean_position_numeric
1              0.000                     50
2              0.001                     50
3              0.002                     50
4              0.003                     50
5              0.004                     50
6              0.005                     50

tail(df.001)
      clean_time_numeric clean_position_numeric
28155             28.153                     47
28156             28.154                     47
28157             28.155                     47
28158             28.156                     47
28159             28.157                     47
28160             28.158                     46
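
The output above ends at row 28160, whereas a 0.001 grid from 0 to 28.158 has 28159 points. That suggests one observed time did not match its grid value exactly in floating point and was kept as an extra row by the join. A minimal base R sketch that avoids matching on doubles is shown below. It swaps the join for findInterval on integer tick counts and assumes the times are sorted and are multiples of 0.001; the names step, ticks, grid, idx and res are illustrative.

```r
numeric_data <- data.frame(
  clean_time_numeric     = c(0, 27.891, 28.119, 28.146, 28.158),
  clean_position_numeric = 50:46
)

step <- 0.001

# Express each observed time as a whole number of 0.001 ticks
ticks <- round(numeric_data$clean_time_numeric / step)

# Regular grid of tick counts from the first to the last observation
grid <- seq(min(ticks), max(ticks))

# For each grid tick, the index of the last observation at or before it
idx <- findInterval(grid, ticks)

res <- data.frame(
  clean_time_numeric     = grid * step,
  clean_position_numeric = numeric_data$clean_position_numeric[idx]
)

nrow(res)
res[27890:27893, ]
```

Because the lookup works on whole tick counts, each observed time lands on exactly one grid row and no duplicate times are created.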

Using the numeric_data data frame shown reproducibly in the Note at the end, convert it to a zoo series using read.zoo. Then set its frequency to 1000 (this is the number of points per unit interval), convert to ts class and use na.locf0 (or na.approx for linear interpolation or na.spline for spline interpolation) to fill in the NAs that were created by the conversion from zoo to ts.

library(zoo)

z <- read.zoo(numeric_data)
frequency(z) <- 1000
tt <- na.locf0(as.ts(z))

length(tt)
## [1] 28159
deltat(tt)
## [1] 0.001
range(time(tt))
## [1]  0.000 28.158

We can now

  1. leave it as a ts object, tt, or
  2. convert it to a zoo series: as.zoo(tt), or
  3. convert it to a data frame: fortify.zoo(tt)
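
The three options above can be sketched as follows, assuming the zoo package is installed. The names zz, df_out and df_base are illustrative, and the last line is a base R alternative added here for comparison, not part of the original answer.

```r
library(zoo)

numeric_data <- data.frame(
  clean_time_numeric     = c(0, 27.891, 28.119, 28.146, 28.158),
  clean_position_numeric = 50:46
)

# Same steps as above: regular series with 1000 points per unit, NAs carried forward
z <- read.zoo(numeric_data)
frequency(z) <- 1000
tt <- na.locf0(as.ts(z))      # option 1: keep the ts object

zz <- as.zoo(tt)              # option 2: zoo series
df_out <- fortify.zoo(zz)     # option 3: data frame with an index column and a value column

# Base R alternative: build the data frame from the ts object directly
df_base <- data.frame(time = as.numeric(time(tt)), position = as.numeric(tt))

head(df_out)
```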

Note

The input in reproducible form:

numeric_data <- 
structure(list(clean_time_numeric = c(0, 27.891, 28.119, 28.146, 
28.158), clean_position_numeric = 50:46), class = "data.frame", row.names = c(NA, -5L))
