R - Conditionally extract data from one file based on data in another file

Question

I have two timestamped data files recorded at different frequencies. I want to extract data from one file (timestamps in YY: DD: MM HH: MM: SS) based on the timestamps of the second file (YY: DD: MM HH: MM: 00), within a range of ±2 minutes. The extraction has to be done for each timestamp value in the second file.

How can I solve this? Do I need a for loop or something else? I am using the xts package and am new to R.

Answer 1

You don't provide a reproducible example, so it's hard to solve your exact problem, but try to adapt this code:

INPUT: two data.frames coming from your two files:

df1<-data.frame(ts1=c("18: 24: 03 11: 12: 13","18: 24: 03 11: 20: 13","18: 24: 03 11: 21: 33"),b=c(1,2,3))
df2<-data.frame(ts2=c("18: 24: 03 9: 50: 00","18: 24: 03 11: 20: 00"))
df1
                    ts1 b
1 18: 24: 03 11: 12: 13 1
2 18: 24: 03 11: 20: 13 2
3 18: 24: 03 11: 21: 33 3

df2
                    ts2
1  18: 24: 03 9: 50: 00
2 18: 24: 03 11: 20: 00

A function f that checks whether a timestamp falls within ±2 minutes of any timestamp in ts2:

f<-function(ts,ts2)
{
  # parse both arguments with the same format string
  fmt<-"%y: %d: %m %H: %M: %S"
  t<-as.POSIXct(ts,format=fmt)
  ref<-as.POSIXct(ts2,format=fmt)
  # TRUE if ts lies within 2 minutes of at least one timestamp in ts2
  out<-(t<=ref+2*60) & (t>=ref-2*60)
  return(any(out))
}

Your desired OUTPUT:

df1[unlist(lapply(as.POSIXct(df1$ts1,format="%y: %d: %m %H: %M: %S"),f,ts2=df2$ts2)),]
                    ts1 b
2 18: 24: 03 11: 20: 13 2
3 18: 24: 03 11: 21: 33 3

This is obviously just a starting point to help you implement your own code.

Update, with a different timestamp format:
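The same ±2 minute check can also be written without calling a function once per row. The following is only a minimal sketch, not part of the original answer: it parses both columns once and compares all pairs with `outer()`. The names `fmt`, `t1`, `t2`, `gap`, `keep` and `result`, and the choice of `tz = "UTC"`, are assumptions made for illustration.

```r
# Sample data in the same "YY: DD: MM HH: MM: SS" layout as above
df1 <- data.frame(
  ts1 = c("18: 24: 03 11: 12: 13", "18: 24: 03 11: 20: 13", "18: 24: 03 11: 21: 33"),
  b = c(1, 2, 3),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ts2 = c("18: 24: 03 9: 50: 00", "18: 24: 03 11: 20: 00"),
  stringsAsFactors = FALSE
)

# Parse each column once (fixed time zone is an assumption, to avoid DST surprises)
fmt <- "%y: %d: %m %H: %M: %S"
t1 <- as.POSIXct(df1$ts1, format = fmt, tz = "UTC")
t2 <- as.POSIXct(df2$ts2, format = fmt, tz = "UTC")

# Pairwise absolute gaps in seconds: one row per df1 entry, one column per df2 entry
gap <- abs(outer(as.numeric(t1), as.numeric(t2), "-"))

# Keep df1 rows that lie within 2 minutes of at least one df2 timestamp
keep <- rowSums(gap <= 2 * 60, na.rm = TRUE) > 0
result <- df1[keep, ]
```

With this sample data, `result` should hold the same two rows (b = 2 and b = 3) as the output shown above. The pairwise matrix grows with the product of the two lengths, so this form suits moderately sized inputs.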

Input:
    df1<-data.frame(a=c(2,5,8,2),ts1=c("2017-10-07 16:51:08.000","2017-10-07 16:51:10.000","2017-10-07 16:52:15.000","2017-10-07 16:53:25.000"))
    df2<-data.frame(ts2=c("2017-10-07 16:50:00","2017-10-07 16:51:00","2017-10-07 16:53:00"))

Same approach:

f<-function(ts,ts2)
{
  # TRUE if ts lies within 2 minutes of at least one timestamp in ts2
  out<-(as.POSIXct(ts)<=as.POSIXct(ts2)+2*60) & (as.POSIXct(ts)>=as.POSIXct(ts2)-2*60)
  return(any(out))
}

Your output:

df1[unlist(lapply(as.POSIXct(df1$ts1),f,ts2=df2$ts2)),]
  a                     ts1
1 2 2017-10-07 16:51:08.000
2 5 2017-10-07 16:51:10.000
3 8 2017-10-07 16:52:15.000
4 2 2017-10-07 16:53:25.000
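All four rows come back here because each ts1 value lies within two minutes of some ts2 value. As a quick check, the sketch below computes the smallest gap from each ts1 entry to any ts2 entry. It is an illustration only: the names `ts1`, `ts2` and `min_gap` are assumptions, the fractional seconds are dropped, and `tz = "UTC"` is chosen arbitrarily.

```r
ts1 <- as.POSIXct(c("2017-10-07 16:51:08", "2017-10-07 16:51:10",
                    "2017-10-07 16:52:15", "2017-10-07 16:53:25"), tz = "UTC")
ts2 <- as.POSIXct(c("2017-10-07 16:50:00", "2017-10-07 16:51:00",
                    "2017-10-07 16:53:00"), tz = "UTC")

# Smallest absolute gap (in seconds) from each ts1 entry to any ts2 entry
min_gap <- sapply(seq_along(ts1), function(i) {
  min(abs(as.numeric(difftime(ts1[i], ts2, units = "secs"))))
})

# TRUE for every row whose nearest reference timestamp is at most 2 minutes away
within_window <- min_gap <= 2 * 60
```

Every element of `within_window` is TRUE for this data, which is why no row is filtered out.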

Answer 2

For example, suppose we have two data tables, Output and Input. Define a for loop, create a window (±2 min) around each Output timestamp, and finally rbind all the data.

Here w_low and w_high are the window widths in minutes on each side of the timestamp (time - 2 min and time + 2 min):

w_low <- 2
w_high <- 2
final <- data.table()
for (i in 1:nrow(Output)) {
  # window bounds around the i-th Output timestamp
  t_low <- Output[i, DateTime] - w_low*60
  t_high <- Output[i, DateTime] + w_high*60
  input_subset <- Input[TIMESTAMP >= t_low & TIMESTAMP < t_high]
  n <- nrow(input_subset)
  # replace TIMESTAMP with the seconds elapsed since the window start
  input_subset[, TIMESTAMP := difftime(TIMESTAMP, t_low, units = "secs")]
  # remember which Output row this subset belongs to
  input_subset$output_index <- rep(i, n)
  final <- rbind(final, input_subset, fill = TRUE)
}
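To try the windowing idea without any extra packages, here is a self-contained base R sketch. It is not the answer's data.table code: the sample values, the `value` column, and the names `half_window` and `pieces` are all made up for illustration, and `tz = "UTC"` is an assumption.

```r
# Hypothetical data: Output holds the reference timestamps,
# Input holds the higher-frequency readings
Output <- data.frame(
  DateTime = as.POSIXct(c("2017-10-07 16:50:00", "2017-10-07 17:10:00"), tz = "UTC")
)
Input <- data.frame(
  TIMESTAMP = as.POSIXct(c("2017-10-07 16:48:30", "2017-10-07 16:51:10",
                           "2017-10-07 16:55:00", "2017-10-07 17:09:00"), tz = "UTC"),
  value = c(10, 20, 30, 40)
)

half_window <- 2 * 60  # seconds on each side of a reference timestamp

# One subset of Input per reference timestamp, tagged with the Output row index
pieces <- lapply(seq_len(nrow(Output)), function(i) {
  t_low <- Output$DateTime[i] - half_window
  t_high <- Output$DateTime[i] + half_window
  sub <- Input[Input$TIMESTAMP >= t_low & Input$TIMESTAMP < t_high, , drop = FALSE]
  sub$output_index <- rep(i, nrow(sub))
  sub
})

# Stack all subsets into one data.frame
final <- do.call(rbind, pieces)
```

Collecting the subsets in a list and binding them once avoids growing `final` inside the loop. A reading that falls inside two overlapping windows appears once per window, each time with a different `output_index`.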

2 answers

Answer 1
0, accepted, 2018-04-24 07:25:40

Answer 2
0, 2018-07-26 04:29:33
