Subset time series data into defined intervals

I am trying to subset, or filter, data into a defined time interval. Can you help me subset the following data into 2-minute time intervals? I have looked at lubridate, split(), and cut() but cannot figure out how to do this properly.

I've looked at this post on SO, but it doesn't seem to be what I need.

Note that columns 1 and 2 are of class character and column 3 is of class POSIXct. If possible, I'd like the solution to use the datetime column (POSIXct).

         date  time            datetime use..kW.     gen..kW. Grid..kW.
120 12/31/2013 21:59 2013-12-31 21:59:00 1.495833 -0.003083333  1.495833
121 12/31/2013 21:58 2013-12-31 21:58:00 1.829583 -0.003400000  1.829583
122 12/31/2013 21:57 2013-12-31 21:57:00 1.977283 -0.003450000  1.977283
123 12/31/2013 21:56 2013-12-31 21:56:00 2.494750 -0.003350000  2.494750
124 12/31/2013 21:55 2013-12-31 21:55:00 2.218283 -0.003500000  2.218283
125 12/31/2013 21:54 2013-12-31 21:54:00 2.008283 -0.003566667  2.008283
126 12/31/2013 21:53 2013-12-31 21:53:00 2.010917 -0.003600000  2.010917
127 12/31/2013 21:52 2013-12-31 21:52:00 2.011867 -0.003583333  2.011867
128 12/31/2013 21:51 2013-12-31 21:51:00 2.015033 -0.003600000  2.015033
129 12/31/2013 21:50 2013-12-31 21:50:00 2.096550 -0.003850000  2.096550

The new subset would just take one row from every two-minute interval and look like:

         date  time            datetime use..kW.     gen..kW. Grid..kW.
121 12/31/2013 21:58 2013-12-31 21:58:00 1.829583 -0.003400000  1.829583
123 12/31/2013 21:56 2013-12-31 21:56:00 2.494750 -0.003350000  2.494750
125 12/31/2013 21:54 2013-12-31 21:54:00 2.008283 -0.003566667  2.008283
127 12/31/2013 21:52 2013-12-31 21:52:00 2.011867 -0.003583333  2.011867
129 12/31/2013 21:50 2013-12-31 21:50:00 2.096550 -0.003850000  2.096550

For my real data, I am actually going to use 5- and 15-minute intervals. But if I get a good solution for the data above with a 2-minute interval, I should be able to adjust the code to fit my needs.

Using cut and plyr::ddply:

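# Bin each timestamp into consecutive 2-minute intervals
groups <- cut(as.POSIXct(df$datetime), breaks="2 min")
library(plyr)
# Keep the last row of each bin, then drop the leading "groups" column
ddply(df, "groups", tail, 1)[, -1]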
#         date  time            datetime use..kW.     gen..kW. Grid..kW.
# 1 12/31/2013 21:50 2013-12-31 21:50:00 2.096550 -0.003850000  2.096550
# 2 12/31/2013 21:52 2013-12-31 21:52:00 2.011867 -0.003583333  2.011867
# 3 12/31/2013 21:54 2013-12-31 21:54:00 2.008283 -0.003566667  2.008283
# 4 12/31/2013 21:56 2013-12-31 21:56:00 2.494750 -0.003350000  2.494750
# 5 12/31/2013 21:58 2013-12-31 21:58:00 1.829583 -0.003400000  1.829583

Or

arrange(ddply(df, "groups", tail, 1)[, -1], datetime, decreasing=TRUE)

if you want to sort it the other way round.
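For the 5- and 15-minute cases mentioned in the question, a base R alternative may be worth considering. The sketch below is only one possible approach, not the method above: it keeps rows whose timestamp is an exact multiple of the interval, counted in seconds since the epoch. The helper name subset_every and the rebuilt sample data (with tz = "UTC") are assumptions made for illustration. As far as I understand, cut() starts its bins at the earliest timestamp in the data, whereas this version always aligns to the clock, so the two can differ when the data starts on an odd minute.

```r
# Sample data rebuilt from the question's table (one reading per minute).
# The UTC time zone is an assumption; the question does not state one.
df <- data.frame(
  datetime = as.POSIXct("2013-12-31 21:59:00", tz = "UTC") - 60 * (0:9),
  use..kW. = c(1.495833, 1.829583, 1.977283, 2.494750, 2.218283,
               2.008283, 2.010917, 2.011867, 2.015033, 2.096550)
)

# Hypothetical helper, not part of any package: keep the rows whose
# timestamp is an exact multiple of `minutes` since the Unix epoch.
subset_every <- function(data, minutes, col = "datetime") {
  secs <- as.numeric(data[[col]])  # POSIXct is stored as seconds since epoch
  data[secs %% (minutes * 60) == 0, , drop = FALSE]
}

two_min  <- subset_every(df, 2)   # rows at 21:58, 21:56, 21:54, 21:52, 21:50
five_min <- subset_every(df, 5)   # rows at 21:55, 21:50
```

Rows that do not fall exactly on a boundary (for example readings with non-zero seconds) are dropped by this filter, so the cut() version is probably the safer choice for irregular timestamps.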
