I have two .csv files called 'tag.csv' and 'wind.csv' that I would like to merge based on the 'Date' column. However, as you can see from the data below the times do not match exactly.
tag.csv
Date
13/12/2014 05:11
13/12/2014 05:43
13/12/2014 06:34
wind.csv
Date
13/12/2014 05:00
13/12/2014 06:00
13/12/2014 07:00
I am using the following simple script.
tag<- read.csv("tag.csv")
wind<- read.csv("wind.csv")
myfulldata = merge(tag, wind)
Is there a command I could add so that the data will merge with the closest time? For example, 13/12/2014 05:11 will merge with 13/12/2014 05:00
data.table
package has its own merging method and an option called roll
that allows closest matches. eg
library(data.table)
dt1 <- data.table(a = c(1,2,3,4), b=1:4)
dt2 <- data.table(a = c(1.6,2.1,3.2,4.9), c=1:4)
setkeyv(dt1, "a")
setkeyv(dt2, "a")
dt1[dt2, roll="nearest"]
See Join R data.tables where key values are not exactly equal--combine rows with closest times for another example.
To be robust though, I would probably instead do some datetime manipulation to force them to both be to the nearest hour, and then do a nearest join if all else fails
edit: this method won't work on data.frame
objects, if you haven't used data.table
before maybe stay simple and round the hours
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.