
How do I convert a dataset into time series data in R?

I have a data frame that I am trying to turn into a time series. A few rows of the data are shown below. As you can see, the data frame does not include every single day. Is there a way to convert this into time series data? The code I have here does not account for the missing days.

df3 <- ts(df2$Freq, start=c(2015), end=c(2017), frequency=365)


structure(list(Police_Killings = structure(c(16437, 16438, 16439, 
16440, 16441, 16442), class = "Date"), Freq = c(2L, 1L, 3L, 1L, 
4L, 4L)), row.names = c(NA, 6L), class = "data.frame")

ts is normally not used for daily series. It is mostly used for regularly spaced monthly and quarterly series. You can represent this as a zoo or xts series.

library(zoo)

d <- data.frame(date = "2017/01/01", Freq = 3)  # sample data
z <- read.zoo(d, format = "%Y/%m/%d")
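
As a rough sketch of how the zoo approach could handle the missing days from the question: the series can be merged with an empty zoo object indexed by every day in the range, with `fill = 0` supplying the value for days that had no row. The three-row `df` below is made-up sample data (with 3 and 4 January absent), not the asker's real data.

```r
library(zoo)

# Hypothetical sample: 2015-01-03 and 2015-01-04 are missing
df <- data.frame(
  Police_Killings = as.Date(c("2015-01-01", "2015-01-02", "2015-01-05")),
  Freq = c(2L, 1L, 3L)
)

# The first column (already of class Date) becomes the index
z <- read.zoo(df)

# Every calendar day between the first and last observation
all_days <- seq(start(z), end(z), by = "day")

# Merge with a zero-width zoo series on the full grid; absent days get 0
z_full <- merge(z, zoo(, all_days), fill = 0)
```

Whether 0 is the right fill value depends on the data: for daily counts it is probably reasonable, while for measurements leaving `NA` (the default) may be safer.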

Since your data is a data frame, you can use complete() from tidyr to add the missing dates and set their Freq to 0.

tidyr::complete(df, Police_Killings = seq(min(Police_Killings), 
      max(Police_Killings), by = "1 day"), fill = list(Freq = 0))

You can then convert this to a time-series object as needed for further processing.

The lubridate package is useful for converting values into dates.

library(lubridate)

df2$Police_Killings <- ymd(df2$Police_Killings)  # df2 is the data frame from the question

Base R solution for converting data frame to a (daily) time-series (ts) object:

# Year of each observation
df$year_of_police_killings <- as.numeric(format(df$Police_Killings, "%Y"))
date_range <- range(df$Police_Killings)
year_range <- range(df$year_of_police_killings)
# Day of year of the earliest observation in the first year
min_step_in_min_year <- min(as.numeric(strftime(df$Police_Killings[df$year_of_police_killings == min(year_range)], "%j")))
# Day of year of the latest observation in the last year
max_step_in_max_year <- max(as.numeric(strftime(df$Police_Killings[df$year_of_police_killings == max(year_range)], "%j")))
# Note: ts() assumes one value per day, so missing days should be filled in first
dat_ts <- ts(df$Freq, start = c(min(year_range), min_step_in_min_year),
             end = c(max(year_range), max_step_in_max_year), frequency = 365)
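
Because ts() has no notion of dates and simply treats consecutive values as consecutive days, the series probably needs to be padded before the call. The following is a minimal base R sketch of that step, using a made-up three-row `df` (the names `all_days` and `padded` are illustrative, not from the question):

```r
# Hypothetical sample: 2015-01-03 and 2015-01-04 are missing
df <- data.frame(
  Police_Killings = as.Date(c("2015-01-01", "2015-01-02", "2015-01-05")),
  Freq = c(2L, 1L, 3L)
)

# One row per calendar day between the first and last observation
all_days <- data.frame(
  Police_Killings = seq(min(df$Police_Killings), max(df$Police_Killings), by = "day")
)

# Left join onto the full grid; days without a row get NA, then 0
padded <- merge(all_days, df, by = "Police_Killings", all.x = TRUE)
padded$Freq[is.na(padded$Freq)] <- 0L

# Start at (year, day of year) of the first date
first_day <- padded$Police_Killings[1]
dat_ts <- ts(
  padded$Freq,
  start = c(as.numeric(format(first_day, "%Y")),
            as.numeric(format(first_day, "%j"))),
  frequency = 365
)
```

Note that `frequency = 365` ignores leap days, so over several years the positions in the ts object drift slightly from the calendar. That is one reason date-indexed classes such as zoo or tsibble tend to be a better fit for daily data.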

You may also use the tsibble package to represent it as a time series:

library(tsibble)
ts_killings <- as_tsibble(df, index = Police_Killings)

You may fill gaps with

ts_killings %>% 
    fill_gaps(Freq = 0)
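
Before filling, it can help to check where the gaps actually are. The sketch below uses tsibble's has_gaps() and count_gaps() on a made-up three-row `df` (the sample data is an assumption for illustration only) and then fills the missing days with 0.

```r
library(tsibble)

# Hypothetical sample: 2015-01-03 and 2015-01-04 are missing
df <- data.frame(
  Police_Killings = as.Date(c("2015-01-01", "2015-01-02", "2015-01-05")),
  Freq = c(2L, 1L, 3L)
)

ts_killings <- as_tsibble(df, index = Police_Killings)

# Does the series have implicit gaps, and where are they?
has_gaps(ts_killings)
count_gaps(ts_killings)

# Insert the missing days as explicit rows with Freq = 0
filled <- fill_gaps(ts_killings, Freq = 0L)
```

Calling fill_gaps() without the `Freq = 0L` argument would insert the missing days with `NA` instead, which may be preferable when a missing row does not necessarily mean a zero count.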

A nice introduction to working with tsibble can be found in the vignette "Introduction to tsibble".
