I have two timestamped data files with different sampling frequencies. I want to extract data from the first file (timestamps of the form YY: DD: MM HH: MM: SS) based on the timestamps of the second file (YY: DD: MM HH: MM: 00), within a range of +/- 2 minutes. The extraction has to be done for each timestamp value in the second file.
How can I solve this? Do I need a for loop, or something else? I am using the xts package and am new to R.
You didn't provide a reproducible example, so it is hard to solve your exact problem, but try adapting this code:
INPUT: two data.frames built from your two files:
df1<-data.frame(ts1=c("18: 24: 03 11: 12: 13","18: 24: 03 11: 20: 13","18: 24: 03 11: 21: 33"),b=c(1,2,3))
df2<-data.frame(ts2=c("18: 24: 03 9: 50: 00","18: 24: 03 11: 20: 00"))
df1
                    ts1 b
1 18: 24: 03 11: 12: 13 1
2 18: 24: 03 11: 20: 13 2
3 18: 24: 03 11: 21: 33 3
df2
                    ts2
1  18: 24: 03 9: 50: 00
2 18: 24: 03 11: 20: 00
A function f that tests whether a timestamp falls within +/- 2 minutes of any timestamp in ts2:
f <- function(ts, ts2)
{
  fmt <- "%y: %d: %m %H: %M: %S"
  t1 <- as.POSIXct(ts, format = fmt)
  t2 <- as.POSIXct(ts2, format = fmt)
  # TRUE if ts lies within +/- 2 minutes (120 seconds) of at least one element of ts2
  out <- (t1 <= t2 + 2*60) & (t1 >= t2 - 2*60)
  return(any(out))
}
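To make the window test inside f easier to follow, here is a minimal sketch of the same comparison for a single pair of timestamps. The variable names (x, ref, lower, upper, inside) are only illustrative, and tz = "UTC" is an assumption added so the result does not depend on the local time zone.

```r
# Format used by the example data: year, day, month, then the time of day
fmt <- "%y: %d: %m %H: %M: %S"

# One timestamp from the first file and one reference timestamp from the second
x   <- as.POSIXct("18: 24: 03 11: 20: 13", format = fmt, tz = "UTC")
ref <- as.POSIXct("18: 24: 03 11: 20: 00", format = fmt, tz = "UTC")

# Arithmetic on POSIXct values is in seconds, so 2 minutes is 2 * 60
lower <- ref - 2 * 60
upper <- ref + 2 * 60

# TRUE when x lies inside the closed window [lower, upper]
inside <- x >= lower & x <= upper
print(inside)
```

The function f applies this same test against every element of ts2 at once and reports whether any of the windows contains the timestamp.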
Your desired OUTPUT :
df1[unlist(lapply(as.POSIXct(df1$ts1,format="%y: %d: %m %H: %M: %S"),f,ts2=df2$ts2)),]
                    ts1 b
2 18: 24: 03 11: 20: 13 2
3 18: 24: 03 11: 21: 33 3
This is obviously just a hint to help you with the implementation of your own code.
Update, with a different timestamp format:
Input:
df1<-data.frame(a=c(2,5,8,2),ts1=c("2017-10-07 16:51:08.000","2017-10-07 16:51:10.000","2017-10-07 16:52:15.000","2017-10-07 16:53:25.000"))
df2<-data.frame(ts2=c("2017-10-07 16:50:00","2017-10-07 16:51:00","2017-10-07 16:53:00"))
Same approach:
f <- function(ts, ts2)
{
  t1 <- as.POSIXct(ts)
  t2 <- as.POSIXct(ts2)
  # TRUE if ts lies within +/- 2 minutes (120 seconds) of at least one element of ts2
  out <- (t1 <= t2 + 2*60) & (t1 >= t2 - 2*60)
  return(any(out))
}
Your output:
df1[unlist(lapply(as.POSIXct(df1$ts1),f,ts2=df2$ts2)),]
  a                     ts1
1 2 2017-10-07 16:51:08.000
2 5 2017-10-07 16:51:10.000
3 8 2017-10-07 16:52:15.000
4 2 2017-10-07 16:53:25.000
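If the files are not too large, the same filter can probably be written without calling a function once per row. The sketch below is one possible vectorized variant, not the method used above: it parses both columns once and uses outer() to build a matrix of all pairwise differences. It uses the data from the first example; tz = "UTC" and the names t1, t2, d, keep and result are assumptions made for illustration.

```r
# Data from the first example (timestamps as year: day: month hour: min: sec)
df1 <- data.frame(ts1 = c("18: 24: 03 11: 12: 13",
                          "18: 24: 03 11: 20: 13",
                          "18: 24: 03 11: 21: 33"),
                  b = c(1, 2, 3),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ts2 = c("18: 24: 03 9: 50: 00",
                          "18: 24: 03 11: 20: 00"),
                  stringsAsFactors = FALSE)

fmt <- "%y: %d: %m %H: %M: %S"

# Parse each column once; tz = "UTC" is an assumption
t1 <- as.POSIXct(df1$ts1, format = fmt, tz = "UTC")
t2 <- as.POSIXct(df2$ts2, format = fmt, tz = "UTC")

# Absolute differences in seconds: one row per df1 entry, one column per df2 entry
d <- abs(outer(as.numeric(t1), as.numeric(t2), "-"))

# Keep the df1 rows that are within 120 seconds of at least one df2 timestamp
keep <- rowSums(d <= 2 * 60) > 0
result <- df1[keep, ]
print(result)
```

The matrix d has nrow(df1) * nrow(df2) entries, so this trades memory for speed and may not suit very long series.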
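For example, suppose we have two data tables, Output and Input, with POSIXct columns DateTime and TIMESTAMP respectively. Define a for loop over the rows of Output, create a window (+/- 2 min) around each timestamp, subset Input to that window, and rbind all the subsets at the end.
Here w_low and w_high are the widths of the window below and above each timestamp, in minutes:
library(data.table)
w_low <- 2
w_high <- 2
final <- data.table()
for (i in seq_len(nrow(Output))) {
  # window bounds around the i-th Output timestamp (POSIXct arithmetic is in seconds)
  t_low <- Output[i, DateTime] - w_low*60
  t_high <- Output[i, DateTime] + w_high*60
  input_subset <- Input[TIMESTAMP >= t_low & TIMESTAMP < t_high]
  n <- nrow(input_subset)
  # replace the timestamp by the seconds elapsed since the start of the window
  input_subset[, TIMESTAMP := difftime(TIMESTAMP, t_low, units = "secs")]
  # remember which Output row this subset belongs to
  input_subset$output_index <- rep(i, n)
  final <- rbind(final, input_subset, fill=TRUE)
}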
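For readers without data.table, the same windowing idea can be sketched in base R. This is an illustrative variant under stated assumptions, not the code above: it reuses the df1/df2 data from the update in the first answer, the helper name extract_window and the columns ref_index and offset_secs are hypothetical, tz = "UTC" is assumed, and the offset is measured from the window centre rather than from its lower bound. The pieces are collected in a list and bound once at the end, which avoids copying the growing result on every iteration.

```r
# Data from the update in the first answer
df1 <- data.frame(a = c(2, 5, 8, 2),
                  ts1 = c("2017-10-07 16:51:08.000", "2017-10-07 16:51:10.000",
                          "2017-10-07 16:52:15.000", "2017-10-07 16:53:25.000"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ts2 = c("2017-10-07 16:50:00", "2017-10-07 16:51:00",
                          "2017-10-07 16:53:00"),
                  stringsAsFactors = FALSE)

# Parse once; tz = "UTC" is an assumption
df1$ts1 <- as.POSIXct(df1$ts1, tz = "UTC")
df2$ts2 <- as.POSIXct(df2$ts2, tz = "UTC")

half_width <- 2 * 60  # window half-width in seconds

# Hypothetical helper: rows of df1 within the window around the i-th df2 timestamp
extract_window <- function(i) {
  centre <- df2$ts2[i]
  in_window <- df1$ts1 >= centre - half_width & df1$ts1 <= centre + half_width
  rows <- df1[in_window, , drop = FALSE]
  # Tag each row with the df2 row it was matched to
  rows$ref_index <- rep(i, nrow(rows))
  # Signed distance from the window centre, in seconds
  rows$offset_secs <- as.numeric(difftime(rows$ts1, centre, units = "secs"))
  rows
}

# One data.frame per df2 timestamp, bound together in a single step
pieces <- lapply(seq_len(nrow(df2)), extract_window)
final <- do.call(rbind, pieces)
print(final)
```

A row of df1 appears once for every window that contains it, so overlapping windows produce repeated rows, each with a different ref_index.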