How to FIND missing observations within a time series and fill with NAs

Question

I have a 10-year-long time series of daily observations. I've discovered that some rows (whole rows, not just observations) are missing from this series, which is problematic for my use case. The dates are all in order, but a given month may start at (ymd) 2017-10-13 instead of 2017-10-01, thus missing 12 observations. I need to identify where the sequence is interrupted like this and insert the right number of rows with the right dates, so that I can have NAs in those spots.

How can I do this?

Here is a reproducible example of a dataframe similar to mine, missing 219 of 4018 datestamped observations:

df <- data.frame(date = seq(as.Date("2007/01/01"), as.Date("2017/12/31"), "days"))
df$obs <- runif(nrow(df))
# sort() keeps the surviving rows in date order after random sampling
df_missing <- df[sort(sample(nrow(df), 3799)), ]

head(df_missing)
        date        obs
1 2007-01-01 0.96428609
2 2007-01-02 0.04199475
3 2007-01-03 0.72729484
4 2007-01-04 0.85591517
5 2007-01-05 0.07373118
6 2007-01-06 0.71093604

Answer 1

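Create a data frame g holding every date in the observed range and merge it with your data frame (df_missing in the question's example). Dates that have no matching row get NA in obs:

rng <- range(df_missing$date)
g <- data.frame(date = seq(rng[1], rng[2], "day"))
df_filled <- merge(df_missing, g, all = TRUE)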
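
The merge fills the gaps but does not report where they were. A minimal base R sketch of the "find" step follows, assuming the data look like the question's df_missing (a Date column named date, sorted, and an obs column with no NAs of its own). The names missing_dates, gap_table and filled are made up for illustration, and the seed is arbitrary.

```r
# Rebuild data like the question's: daily rows with 219 dropped at random.
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
all_dates <- seq(as.Date("2007-01-01"), as.Date("2017-12-31"), by = "day")
df <- data.frame(date = all_dates, obs = runif(length(all_dates)))
df_missing <- df[sort(sample(nrow(df), 3799)), ]

# Full grid of dates spanning the observed range.
rng <- range(df_missing$date)
g <- data.frame(date = seq(rng[1], rng[2], by = "day"))

# FIND (per date): dates in the grid that are absent from the data.
missing_dates <- g$date[!g$date %in% df_missing$date]

# FIND (per gap): consecutive rows more than one day apart mark an interruption.
step <- as.integer(diff(df_missing$date))
at <- which(step > 1)
gap_table <- data.frame(
  last_before = df_missing$date[at],      # last date seen before the gap
  first_after = df_missing$date[at + 1],  # first date seen after the gap
  n_missing   = step[at] - 1L             # number of rows to insert
)

# FILL: merging on the shared date column adds one NA row per missing date.
filled <- merge(df_missing, g, all = TRUE)

head(gap_table)
length(missing_dates)
sum(is.na(filled$obs))
```

Because the grid is built from range(df_missing$date), days dropped before the first or after the last surviving row are not detected. If the true start and end dates are known, build g from those instead.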

solution1 2 ACCEPTED 2018-12-09 23:45:31