
How to find missing observations within a time series and fill with NAs

I have a 10-year time series of daily observations. I've discovered that some rows (whole rows, not just observation values) are missing from the series, which is a problem for my use case. The dates are all in order, but a given month may start at (ymd) 2017-10-13 instead of 2017-10-01, so 12 observations are missing. I need to identify where the sequence is interrupted like this and insert the right number of rows with the right dates, so that I can have NAs in those spots.

How can I do this?

Here is a reproducible example of a data frame similar to mine, missing 219 of 4018 date-stamped observations:

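# 4018 daily dates with random observations
df <- data.frame(date = seq(as.Date("2007/01/01"), as.Date("2017/12/31"), by = "day"))
df$obs <- runif(nrow(df))
# Keep 3799 randomly chosen rows, in date order, so 219 are missing
df_missing <- df[sort(sample(nrow(df), 3799)), ]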

head(df_missing)
        date        obs
1 2007-01-01 0.96428609
2 2007-01-02 0.04199475
3 2007-01-03 0.72729484
4 2007-01-04 0.85591517
5 2007-01-05 0.07373118
6 2007-01-06 0.71093604

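Create a data frame g holding a grid of all dates between the first and last date in your data, then merge it with your data frame. With all = TRUE, every date that appears only in the grid is kept, and its other columns are filled with NA: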

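DF <- df_missing                                         # DF stands for your data frame, here the one from the question
rng <- range(DF$date)                                    # first and last date present
g <- data.frame(date = seq(rng[1], rng[2], by = "day"))  # one row per calendar day
DF_full <- merge(DF, g, all = TRUE)                      # missing dates get NA in obs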
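
To see where the gaps actually were, the merge can be combined with a check on the spacing of the dates. The following is only a sketch under some assumptions: the seeded sample data, and the names full, keep, filled, missing_dates, steps, gap_starts and gap_sizes, are illustrative and not part of the original question or answer. It also assumes obs has no NAs of its own, so that an NA after the merge always marks an inserted row.

```r
# Hypothetical, seeded stand-in for the question's data (names are illustrative)
set.seed(42)
full <- data.frame(
  date = seq(as.Date("2007-01-01"), as.Date("2017-12-31"), by = "day"),
  obs  = runif(4018)
)
keep <- sort(sample(nrow(full), 3799))   # drop 219 rows, keep date order
df_missing <- full[keep, ]

# Grid of every calendar day between the first and last date present
rng <- range(df_missing$date)
g <- data.frame(date = seq(rng[1], rng[2], by = "day"))

# Outer join: dates absent from df_missing appear with obs = NA
filled <- merge(df_missing, g, all = TRUE)

# Assuming obs had no NAs originally, the NA rows are the inserted dates
missing_dates <- filled$date[is.na(filled$obs)]

# Gaps can also be located directly: a step of more than 1 day marks a break
steps <- as.integer(diff(df_missing$date))
gap_starts <- df_missing$date[-nrow(df_missing)][steps > 1]  # last date before each gap
gap_sizes  <- steps[steps > 1] - 1                           # rows missing in each gap
```

Because the grid is built from range() of the dates that are present, rows missing at the very start or end of the period are not restored. If the intended first and last dates are known, they can be passed to seq() explicitly instead of rng[1] and rng[2].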

