How to mark the observations with given information
Consider data collected at 5-minute intervals, with a numeric variable `a` and a discrete variable `acc` that indicates whether an incident happened (0 for no incident, 1 for an incident):
a<-c(1:(288*4))
t<-seq(as.POSIXct("2016-01-01 00:05:00"), as.POSIXct("2016-01-05 00:00:00"), by = '5 min')
acc<-rep(0,288*4)
df<-data.frame(t,a,acc)
Now I have another data set that holds the times (accurate to 1 second) at which incidents happened during the collection period:
T<-sample(seq(as.POSIXct("2016-01-01 00:05:00"), as.POSIXct("2016-01-05 00:00:00"), by = '1 sec'),size = 5)
For each time in `T`, I want to set `acc` to 1 for the two nearest prior observations. For example, if an incident happened at 2016-01-02 07:13:23, the observations with `t` equal to 2016-01-02 07:05:00 and 2016-01-02 07:10:00 should have `acc` set to 1.
How could I manage to do this?
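# Index of the last observation at or before each incident time
ind <- findInterval(T, df$t)
# Flag that observation and the one immediately after it
df$acc[c(ind, ind + 1)] <- 1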
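The `findInterval()` snippet above flags the observation at or before each incident plus the one after it. If the goal is strictly the two prior observations, as in the question's example, a variant along the following lines might fit. This is only a sketch: the fixed incident time, the explicit `tz = "UTC"`, the name `incidents` (used instead of `T`, which masks the shorthand for `TRUE`) and the bounds check are assumptions added for illustration, not part of the original answer. An incident that falls exactly on an observation time is treated here as having that observation as "prior".

```r
# Observations every 5 minutes, as in the question (time zone fixed for reproducibility)
t <- seq(as.POSIXct("2016-01-01 00:05:00", tz = "UTC"),
         as.POSIXct("2016-01-05 00:00:00", tz = "UTC"), by = "5 min")
df <- data.frame(t = t, a = seq_along(t), acc = 0)

# The example incident from the question (fixed rather than sampled)
incidents <- as.POSIXct("2016-01-02 07:13:23", tz = "UTC")

# Index of the last observation at or before each incident
ind <- findInterval(incidents, df$t)

# Candidate rows: that observation and the one before it
idx <- c(ind - 1, ind)
# Drop indices that fall outside the data (incident near the very start)
idx <- idx[idx >= 1 & idx <= nrow(df)]

df$acc[idx] <- 1

# Show the flagged rows
df[df$acc == 1, c("t", "acc")]
```

With the example incident, the flagged rows are the observations at 07:05:00 and 07:10:00 on 2016-01-02, matching the question.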
One way could be:
library(lubridate)
df$acc=apply(sapply(T,function(x) x %within% interval((df$t - minutes(4)-seconds(59)),(df$t + minutes(4)+seconds(59)))),1,sum)
lubridate makes date-time manipulation easy; `minutes(x)` and `seconds(x)` create period objects of x minutes or x seconds, which can be added to or subtracted from date-times.
`interval()` creates, for each observation, a time interval spanning `df$t` ± 4 min 59 s.
`sapply()` checks, for each time in `T`, which of these intervals contain it.
`apply()` collapses the result of `sapply()` (which has one column per element of `T`) by summing across each row, so an observation close to two incidents gets a value of 2 rather than 1.
If `T` contains a value exactly equal to one in `df$t`, such as 2016-01-04 12:05:00 CET, only that single observation is set to 1.
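To see how the ± 4 min 59 s window behaves, including the exact-match case and overlapping incidents, here is a small base R sketch. It is not the answer's code: it swaps lubridate's `%within%` for a plain comparison of time differences via `outer()`, and the three hand-picked incident times, the `tz = "UTC"` setting and the names `incidents`, `gap`, `hits` and `counts` are assumptions made for illustration.

```r
# Observation times every 5 minutes, as in the question
t <- seq(as.POSIXct("2016-01-01 00:05:00", tz = "UTC"),
         as.POSIXct("2016-01-05 00:00:00", tz = "UTC"), by = "5 min")

# Hand-picked incidents: two close together, one exactly on an observation time
incidents <- as.POSIXct(c("2016-01-02 07:13:23",
                          "2016-01-02 07:16:40",
                          "2016-01-04 12:05:00"), tz = "UTC")

# Absolute gap in seconds between every observation (rows) and incident (columns)
gap <- abs(outer(as.numeric(t), as.numeric(incidents), "-"))

# TRUE where the incident lies within 4 min 59 s of the observation
hits <- gap <= 4 * 60 + 59

# Row sums count the incidents near each observation (can exceed 1)
counts <- rowSums(hits)

# Collapse the counts to a 0/1 indicator if only a flag is wanted
acc <- as.integer(counts > 0)

# Inspect the affected observations
data.frame(t = t, counts = counts, acc = acc)[counts > 0, ]
```

Here the observation at 07:15:00 lies within the window of both nearby incidents, so its count is 2, while the incident at exactly 12:05:00 touches only that one observation because its neighbours are a full 5 minutes away. Converting with `counts > 0` keeps `acc` a 0/1 variable as described in the question.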