
Join two data frames in R based on closest timestamp

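Hi, I have two tables (table1 and table2 below) and would like to join them based on the closest timestamp to form expected_output. A solution involving dplyr would be nice if possible, but not if it complicates things further.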

table1 = 
structure(list(date = structure(c(1437051300, 1434773700, 1431457200
), class = c("POSIXct", "POSIXt"), tzone = ""), val1 = c(94L, 
33L, 53L)), .Names = c("date", "val1"), row.names = c(NA, -3L
), class = "data.frame")

table2 = 
structure(list(date = structure(c(1430248288, 1435690482, 1434050843
), class = c("POSIXct", "POSIXt"), tzone = ""), val2 = c(67L, 
90L, 18L)), .Names = c("date", "val2"), row.names = c(NA, -3L
), class = "data.frame")

expected_output = 
structure(list(date = structure(c(1437051300, 1434773700, 1431457200
), class = c("POSIXct", "POSIXt"), tzone = ""), val1 = c(94L,
33L, 53L), val2 = c(90L, 18L, 67L)), .Names = c("date", "val1", 
"val2"), row.names = c(NA, -3L), class = "data.frame")

Use the rolling join feature of data.table with roll = "nearest":

require(data.table) # v1.9.6+
setDT(table1)[, val2 := setDT(table2)[table1, val2, on = "date", roll = "nearest"]]

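Here, the val2 column is created by performing a join on the date column with the roll = "nearest" option. For each row of table1$date, the closest matching row in table2$date is found, and val2 from that row is extracted.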
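To see which table2 timestamp each row was matched to, the rolling join can also be written as a full join that returns a new table. The following is a minimal sketch, not part of the original answer: the names t1, t2, matched_date and res are made up for illustration, and the time zone is fixed to "UTC" only so the example is reproducible.

```r
library(data.table)

# Same values as table1 and table2 in the question, rebuilt as data.tables
t1 <- data.table(
  date = as.POSIXct(c(1437051300, 1434773700, 1431457200),
                    origin = "1970-01-01", tz = "UTC"),
  val1 = c(94L, 33L, 53L)
)
t2 <- data.table(
  date = as.POSIXct(c(1430248288, 1435690482, 1434050843),
                    origin = "1970-01-01", tz = "UTC"),
  val2 = c(67L, 90L, 18L)
)

# In the join result the `date` column holds t1's timestamps, so keep a
# copy of t2's own timestamp under a hypothetical name, `matched_date`
t2[, matched_date := date]

# For each row of t1, pick the row of t2 whose date is nearest
res <- t2[t1, on = "date", roll = "nearest"]
print(res)
```

The result has one row per row of t1, in t1's order, with val2 and matched_date taken from the nearest t2 row.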

This might be slow, but...

d   <- function(x,y) abs(x-y) # define the distance function
idx <- sapply( table1$date, function(x) which.min( d(x,table2$date) )) # find matches

cbind(table1,table2[idx,-1,drop=FALSE])
#                  date val1 val2
# 2 2015-07-16 08:55:00   94   90
# 3 2015-06-20 00:15:00   33   18
# 1 2015-05-12 15:00:00   53   67

Another way to construct idx is max.col(-outer(table1$date, table2$date, d)).
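A self-contained sketch of the max.col variant is shown below. It departs from the one-liner above in two ways, both assumptions made here for illustration: the dates are converted to plain numbers before calling outer, and ties.method = "first" is passed so that ties are not broken at random (the default for max.col). The names dist_mat and result are illustrative, and the time zone is fixed to "UTC" for reproducibility.

```r
# Same values as table1 and table2 in the question
table1 <- data.frame(
  date = as.POSIXct(c(1437051300, 1434773700, 1431457200),
                    origin = "1970-01-01", tz = "UTC"),
  val1 = c(94L, 33L, 53L)
)
table2 <- data.frame(
  date = as.POSIXct(c(1430248288, 1435690482, 1434050843),
                    origin = "1970-01-01", tz = "UTC"),
  val2 = c(67L, 90L, 18L)
)

# Matrix of absolute time differences in seconds:
# rows follow table1, columns follow table2
dist_mat <- abs(outer(as.numeric(table1$date), as.numeric(table2$date), "-"))

# max.col returns the column of each row's maximum; negating the matrix
# turns that into the column of each row's minimum distance
idx <- max.col(-dist_mat, ties.method = "first")

result <- cbind(table1, val2 = table2$val2[idx])
print(result)
```

Note that max.col treats entries within a small relative tolerance of the row maximum as ties, so when two candidate timestamps are almost equally close, which.min on each row is the stricter choice.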
