
Is there a way to join two tables in pandas on temporal columns by specifying a time range?

Say I have two data frames df1 and df2. They both have columns of the form

Date/Time
01-06-2013 23:00:00
02-06-2013 01:00:00
02-06-2013 21:00:00
02-06-2013 22:00:00
02-06-2013 23:00:00

I want a function

join_temporal(df1, df2, range=<num>, unit=<"seconds" xor "minutes" xor "hours" xor "days">)

So if I call

join_temporal(df1, df2, range=3, unit="days")

I get the joined rows that fall within a 3-day range.

If I call

join_temporal(df1, df2, range=2, unit="hours")

I get the joined rows that fall within a 2-hour range.

Are there any good pandas options to help implement the join_temporal function?

I can't think of any built-in pandas method that does this directly. My advice is to create a new column that holds the datetime in a less granular form (i.e. day or hour). If you want to round to a fixed frequency, use round. Otherwise, DateOffset should help you find the nearest day/month/year. From there, you can use groupby to cluster on that column.
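A minimal sketch of the rounding idea, using the timestamps from the question. It assumes the dates are day-first (dd-mm-yyyy); the column name bucket is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Date/Time": pd.to_datetime(
        [
            "01-06-2013 23:00:00",
            "02-06-2013 01:00:00",
            "02-06-2013 21:00:00",
            "02-06-2013 22:00:00",
            "02-06-2013 23:00:00",
        ],
        format="%d-%m-%Y %H:%M:%S",  # assumption: day comes before month
    )
})

# Less granular copy of the timestamp: round to the nearest day.
# "bucket" is a hypothetical column name.
df["bucket"] = df["Date/Time"].dt.round("D")

# Cluster the rows that share a bucket and count them.
group_sizes = df.groupby("bucket").size()
print(group_sizes)
```

With this data the first two rows round to 2013-06-02 and the last three to 2013-06-03. Note that rounding puts rows into fixed buckets, so two timestamps that are close together but sit on opposite sides of a bucket boundary end up in different groups.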

I'm not really sure what you mean by "join" the rows, since you didn't supply any sample data. There could be better solutions depending on this. It would also depend on whether there is a maximum size for any one group.
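If "join" means pairing every row of df1 with every row of df2 whose timestamps are at most the given range apart, one possible sketch is a cross join followed by a filter. This is only one reading of the question, not a pandas built-in: the on parameter, the _key helper column and the _1/_2 suffixes are assumptions, and both frames are assumed to share a datetime column of the same name. The frames are passed first because Python requires positional arguments before keyword arguments.

```python
import pandas as pd


def join_temporal(df1, df2, range, unit, on="Date/Time"):
    """Pair rows of df1 and df2 whose `on` timestamps differ by at most
    `range` units. The name `range` follows the question and shadows the
    builtin inside this function only."""
    tolerance = pd.Timedelta(range, unit=unit)  # e.g. 2, "hours"
    # Cross join through a constant helper key (hypothetical name "_key").
    # The intermediate result has len(df1) * len(df2) rows.
    pairs = (
        df1.assign(_key=1)
        .merge(df2.assign(_key=1), on="_key", suffixes=("_1", "_2"))
        .drop(columns="_key")
    )
    gap = (pairs[f"{on}_1"] - pairs[f"{on}_2"]).abs()
    return pairs[gap <= tolerance].reset_index(drop=True)


# Split the question's timestamps across two frames for illustration.
stamps = pd.to_datetime(
    [
        "01-06-2013 23:00:00",
        "02-06-2013 01:00:00",
        "02-06-2013 21:00:00",
        "02-06-2013 22:00:00",
        "02-06-2013 23:00:00",
    ],
    format="%d-%m-%Y %H:%M:%S",  # assumption: day comes before month
)
df1 = pd.DataFrame({"Date/Time": stamps[:3]})
df2 = pd.DataFrame({"Date/Time": stamps[3:]})

within_two_hours = join_temporal(df1, df2, range=2, unit="hours")
within_one_day = join_temporal(df1, df2, range=1, unit="days")
print(within_two_hours)
```

Because the cross join materialises every pair before filtering, this only suits frames small enough for that intermediate result to fit in memory. If a single nearest match per row is enough, pd.merge_asof with its tolerance parameter may be a better fit.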
