How to find missing observations in a time series and fill them with NA

Question

I have a 10-year time series of daily observations. I've discovered that some rows (whole rows, not just the observed values) are missing from this series, which is a problem for my use case. The dates are all in order, but a given month may start at (ymd) 2017-10-13 instead of 2017-10-01, so 12 observations are missing. I need to identify where the sequence is interrupted like this and insert the right number of rows with the right dates, so that I have NAs in those spots.

How can I do this?

Here is a reproducible example of a data frame similar to mine, missing 219 of 4018 datestamped observations:

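# One row per day from 2007-01-01 to 2017-12-31 (4018 days)
df <- data.frame(date = seq(as.Date("2007/01/01"), as.Date("2017/12/31"), "days"))
df$obs <- runif(nrow(df))
# Keep 3799 randomly chosen rows; sort() keeps the dates in order
df_missing <- df[sort(sample(nrow(df), 3799)), ]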

head(df_missing)
        date        obs
1 2007-01-01 0.96428609
2 2007-01-02 0.04199475
3 2007-01-03 0.72729484
4 2007-01-04 0.85591517
5 2007-01-05 0.07373118
6 2007-01-06 0.71093604

Answer 1

Create a data frame g holding a grid of all dates and merge it with your data frame (called DF here). With all = TRUE, dates that appear only in g are kept, and their obs values are NA:

rng <- range(DF$date)
g <- data.frame(date = seq(rng[1], rng[2], "day"))
merge(DF, g, all = TRUE)
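
As a rough end-to-end sketch, the same merge can be applied to the question's example data and checked. The names `filled`, `gaps`, `step`, and `n_missing_by_gap` are illustrative only and do not come from the original posts; `set.seed()` is added just to make the run repeatable. The `diff()` check at the end is an extra way to locate the interruptions, not part of the answer above.

```r
# Rebuild the question's example (set.seed is an addition for repeatability)
set.seed(1)
df <- data.frame(date = seq(as.Date("2007-01-01"), as.Date("2017-12-31"), by = "day"))
df$obs <- runif(nrow(df))
df_missing <- df[sort(sample(nrow(df), 3799)), ]

# Full grid of dates spanning the observed range
rng <- range(df_missing$date)
g <- data.frame(date = seq(rng[1], rng[2], by = "day"))

# Outer join on the shared "date" column: dates absent from
# df_missing are kept and get obs = NA
filled <- merge(df_missing, g, all = TRUE)

# Dates of the rows that were inserted
gaps <- filled$date[is.na(filled$obs)]

# Independent check on the original data: a step of more than one day
# between consecutive dates marks an interruption
step <- as.integer(diff(df_missing$date))
n_missing_by_gap <- step - 1L

head(gaps)
sum(n_missing_by_gap)
```

Note that `range()` only covers the first and last dates actually present, so rows dropped from the very start or end of the series are not restored. If the intended start and end dates are known, the grid could be built from those instead.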

Solution 1
2 Accepted 2018-12-09 23:45:31
