
Conditional interpolation of time series data in R

I have time series data with N/As. The data are to end up in an animated scatterplot.

Week    X   Y
 1      1   105
 2      3   110
 3      5   N/A
 4      7   130
 8     15   160
12     23   180
16     30   N/A
20     37   200

For a smooth animation, the data will be supplemented by additional, calculated rows. For the X values this is simple arithmetic: every integer between the first and last X gets its own row. No problem so far.

Week    X   Y
 1      1   105
        2
 2      3   110
        4
 3      5   N/A
        6
 4      7   130
        8
        9
       10
       11
       12
       13
       14
 8     15   160
       16
       17
       18
       19
       20
       21
       22
12     23   180
       24
       25
       26
       27
       28
       29
16     30   N/A
       31
       32
       33
       34
       35
       36
20     37   200

The Y values should be interpolated, with one additional requirement: interpolation should only happen between two consecutive known values, not between values that have an N/A between them.

Week    X   Value
 1      1   105
        2   interpolated value
 2      3   110
        4
 3      5   N/A
        6
 4      7   130
        8   interpolated value
        9   interpolated value
       10   interpolated value
       11   interpolated value
       12   interpolated value
       13   interpolated value
       14   interpolated value
 8     15   160
       16   interpolated value
       17   interpolated value
       18   interpolated value
       19   interpolated value
       20   interpolated value
       21   interpolated value
       22   interpolated value
12     23   180
       24
       25
       26
       27
       28
       29
16     30   N/A
       31
       32
       33
       34
       35
       36
20     37   200

I have already experimented with approx, converted the "original" N/As to placeholder values, and tried the zoo package with na.approx etc., but I can't work out how to express the correct condition for this kind of "conditional approximation" or "conditional gap filling". Any hint is welcome and much appreciated.

Thanks in advance.

Replace the NAs with Inf, interpolate, and then revert the non-finite values to NA.

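library(zoo)

# Mark the original NAs with Inf so they differ from the NAs that merge() inserts
DF2 <- DF
DF2$Y[is.na(DF2$Y)] <- Inf

# Expand to one row per week; newly inserted weeks get NA in X and Y
w <- merge(DF2, data.frame(Week = min(DF2$Week):max(DF2$Week)), by = 1, all.y = TRUE)
# Interpolate; any gap that touches an Inf comes out as Inf or NaN
w$Value <- na.approx(w$Y)
# Turn every non-finite result back into NA
w$Value[!is.finite(w$Value)] <- NA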

This gives the following, where Week has been expanded to all weeks. In Y, the original NAs are shown as Inf and the rows inserted by the merge as NA. Value is the interpolated Y.

> w
   Week  X   Y Value
1     1  1 105 105.0
2     2  3 110 110.0
3     3  5 Inf    NA
4     4  7 130 130.0
5     5 NA  NA 137.5
6     6 NA  NA 145.0
7     7 NA  NA 152.5
8     8 15 160 160.0
9     9 NA  NA 165.0
10   10 NA  NA 170.0
11   11 NA  NA 175.0
12   12 23 180 180.0
13   13 NA  NA    NA
14   14 NA  NA    NA
15   15 NA  NA    NA
16   16 30 Inf    NA
17   17 NA  NA    NA
18   18 NA  NA    NA
19   19 NA  NA    NA
20   20 37 200 200.0
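
To see why marking the original NAs with Inf blocks interpolation across them, here is a minimal sketch of the arithmetic. It assumes the interpolation evaluates the usual linear formula a + (b - a) * t; lerp is a hypothetical helper written for this illustration, not a function from zoo or base R.

```r
# Hypothetical helper: linear interpolation a fraction t of the way from a to b
lerp <- function(a, b, t) a + (b - a) * t

up   <- lerp(180, Inf, 0.25)  # gap that ends at a marked NA: 180 + Inf
down <- lerp(Inf, 200, 0.25)  # gap that starts at a marked NA: Inf + (-Inf)
ok   <- lerp(130, 160, 0.25)  # gap between two known values (weeks 4 and 8)

print(c(up, down, ok))
# Only the gap between two known values is finite, so only it is kept
print(is.finite(c(up, down, ok)))
```

The first two results are non-finite (Inf and NaN respectively), so the final is.finite step turns them into NA, while the third matches the value for week 5 in the output above.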

Note: Input DF in reproducible form:

Lines <- "
Week    X   Y
 1      1   105
 2      3   110
 3      5   N/A
 4      7   130
 8     15   160
12     23   180
16     30   N/A
20     37   200"
DF <- read.table(text = Lines, header = TRUE, na.strings = "N/A")
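
The question expands the data along X (1 to 37) rather than along Week, so the same idea may be applied to the X grid instead. The following is a sketch under that assumption, using base R approx rather than zoo; interp_between_known is a hypothetical helper name, and the block repeats the input so it runs on its own.

```r
# Same input as DF above, repeated so the block is self-contained
DF <- read.table(text = "
Week X Y
1 1 105
2 3 110
3 5 N/A
4 7 130
8 15 160
12 23 180
16 30 N/A
20 37 200", header = TRUE, na.strings = "N/A")

# Hypothetical helper: interpolate y at every integer x, but leave
# empty any gap that touches an original NA
interp_between_known <- function(x, y) {
  grid <- seq(min(x), max(x))
  y[is.na(y)] <- Inf                  # mark the original NAs
  out <- approx(x, y, xout = grid)$y  # linear interpolation (base R)
  out[!is.finite(out)] <- NA          # Inf and NaN become NA again
  data.frame(X = grid, Value = out)
}

res <- interp_between_known(DF$X, DF$Y)
print(res)
```

Rows between two known Y values (for example X = 2, or X = 8 to 14) receive interpolated values, while rows next to an original N/A (X = 4 to 6 and X = 24 to 36) stay NA, which is the layout requested in the question.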
