
How do I resample and interpolate timeseries data in R?

I have measurements that have been recorded approximately every 5 minutes:

2012-07-09T05:30:01+02:00   1906.1  1069.2  1093.2  3   1071.0  1905.7  
2012-07-09T05:35:02+02:00   1905.7  1069.2  1093.0  0   1071.5  1905.7  
2012-07-09T05:40:02+02:00   1906.1  1068.7  1093.2  0   1069.4  1905.7  
2012-07-09T05:45:02+02:00   1905.7  1068.4  1093.0  1   1069.6  1905.7  
2012-07-09T05:50:02+02:00   1905.7  1068.2  1093.0  4   1073.3  1905.7  

The first column is the data's timestamp. The remaining columns are the recorded data.

I need to resample my data so that I have one row every 15 minutes, e.g. something like:

2012-07-09T05:15:00 XX XX XX XX XX XX
2012-07-09T05:30:00 XX XX XX XX XX XX
....

(In addition, there may be gaps in the recorded data, and I would like any gap of more than, say, one hour to be replaced with a row of NA values.)

I can think of several ways to program this by hand, but is there built-in support for this kind of thing in R? I've looked at the different libraries for dealing with timeseries data (zoo, chron, etc.) but couldn't find anything satisfactory.

You can use approx or the related approxfun. If t is the vector of time points at which your data was sampled and y is the vector of measurements, then f <- approxfun(t, y) creates a function f that linearly interpolates between the sampled points.

Example:

# irregular time points at which data was sampled
t <- c(5,10,15,25,30,40,50)
# measurements 
y <- c(4.3,1.2,5.4,7.6,3.2,1.2,3.7)

f <- approxfun(t,y)

# get interpolated values for time points 5, 20, 35, 50
f(seq(from=5,to=50,by=15))
[1] 4.3 6.5 2.2 3.7
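
The same idea can be applied to timestamped data like that in the question. What follows is only a rough sketch in base R, not a built-in resampling routine: it converts the timestamps to seconds, interpolates one column onto a 15-minute grid with approx, and then sets grid points to NA when they fall inside a sampling gap longer than one hour. The names stamps, values, step, max_gap and resampled are made up for illustration, the 07:30:00 reading is invented to create a gap, and the +02:00 offset from the question is dropped by assuming UTC.

```r
# Sketch only: timestamps and first data column copied from the question's
# sample rows; the 07:30:00 reading is hypothetical and creates a long gap.
stamps <- as.POSIXct(
  c("2012-07-09 05:30:01", "2012-07-09 05:35:02", "2012-07-09 05:40:02",
    "2012-07-09 05:45:02", "2012-07-09 05:50:02", "2012-07-09 07:30:00"),
  tz = "UTC"  # assumption: the +02:00 offset is dropped for simplicity
)
values <- c(1906.1, 1905.7, 1906.1, 1905.7, 1905.7, 1904.9)

step    <- 15 * 60   # target spacing in seconds
max_gap <- 60 * 60   # largest gap we are willing to interpolate across

t_num <- as.numeric(stamps)  # seconds since the epoch

# Regular 15-minute grid that lies inside the sampled range
grid <- seq(ceiling(min(t_num) / step) * step,
            floor(max(t_num) / step) * step,
            by = step)

# Linear interpolation onto the grid
interp <- approx(t_num, values, xout = grid)$y

# Width of the sampling interval that contains each grid point
left  <- findInterval(grid, t_num)
right <- pmin(left + 1, length(t_num))
gap   <- t_num[right] - t_num[left]

# Blank out grid points inside a wide gap, unless they coincide with a sample
interp[gap > max_gap & grid != t_num[left]] <- NA

resampled <- data.frame(
  time  = as.POSIXct(grid, origin = "1970-01-01", tz = "UTC"),
  value = interp
)
print(resampled)
```

With this made-up input, the 05:45:00 row is interpolated from its two neighbouring readings, the rows from 06:00:00 to 07:15:00 become NA because they sit inside the long gap, and the 07:30:00 row keeps the reading taken at that instant. For several data columns, the approx call could be repeated per column, for example with sapply.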

There's a good discussion of this on CrossValidated: https://stats.stackexchange.com/questions/31666/how-can-i-align-synchronize-two-signals . The author of that answer "rolled his own" interpolate-and-resample code.

If you are looking for built-in downsampling (upsampling is not supported), you can also use the xts package.

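library(xts)
# sample_matrix ships with xts and holds daily OHLC prices
data(sample_matrix)
samplexts <- as.xts(sample_matrix)
# aggregate the daily rows into monthly and yearly OHLC bars
to.monthly(samplexts)
to.yearly(samplexts)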
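
For the 15-minute case in the question, one possible way to downsample with xts is to average the readings in each 15-minute window. The sketch below assumes that a per-window mean is acceptable (it is aggregation, not interpolation), reuses the timestamps and first data column from the question with the +02:00 offset dropped in favour of UTC, and uses the illustrative names x, ep and q15.

```r
library(xts)

# Timestamps and first data column from the question's sample rows
# (assumption: times are treated as UTC, ignoring the +02:00 offset)
stamps <- as.POSIXct(
  c("2012-07-09 05:30:01", "2012-07-09 05:35:02", "2012-07-09 05:40:02",
    "2012-07-09 05:45:02", "2012-07-09 05:50:02"),
  tz = "UTC"
)
x <- xts(c(1906.1, 1905.7, 1906.1, 1905.7, 1905.7), order.by = stamps)

# Row positions that close each 15-minute window
ep <- endpoints(x, on = "minutes", k = 15)

# Mean of the readings inside each window
q15 <- period.apply(x, ep, mean)

# Move each window's timestamp up to the next 15-minute boundary
q15 <- align.time(q15, n = 15 * 60)
print(q15)
```

Each output row is stamped with the end of its window, so the readings from 05:30:01 to 05:40:02 are reported at 05:45:00. Windows that contain no readings simply do not appear in the result, so a long gap would have to be filled with NA rows separately.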

You should take a look at the openair package, which has a lot of "tools" for working with time series data.
