
Pandas Join Two Tables Based On Time Condition

I'm trying to join two tables in Pandas based on timestamp. Basically the structure looks something like this:

Table 2

Timestamp          Truck     MineX           MineY
2016-08-27 01:10    CT77    -11346.36655    -650404.405
2016-08-27 01:12    CT45    -11596.88137    -648294.056
2016-08-27 01:13    CT67    -11953.16118    -648325.114
2016-08-27 01:13    CT75    -11326.54075    -650447.462
2016-08-27 01:14    CT79    -11380.27834    -650425.968
2016-08-27 01:15    CT26    -9493.153286    -652313.633
2016-08-27 01:16    CT73    -11527.47602    -650210.723
2016-08-27 01:16    CT40    -11596.90867    -648260.214
2016-08-27 01:17    CT26    -9493.153286    -652313.633
2016-08-27 01:17    CT80    -11363.34558    -650385.959
2016-08-27 01:17    CT72    -11527.47355    -650213.8

Table 1

Truck   LoadLocation    Tonnes  ArriveTimestamp
CT70    338-001       261         2016-02-21 00:23
CT66    338-001       271         2016-02-21 00:31
CT62    338-001       264         2016-02-21 00:45
CT73    338-001       254         2016-02-21 00:54
CT71    338-001       250         2016-02-21 01:04
CT39    338-001       182.172     2016-02-21 01:11
CT62    338-001       285         2016-02-21 01:19
CT70    338-001       282         2016-02-21 01:25
CT73    338-001       250         2016-02-21 01:30
CT73    338-001       275         2016-02-21 01:35
CT64    338-001       253         2016-02-21 01:42

Table 1 and Table 2 need to be joined where Timestamp and ArriveTimestamp are within one minute of each other and the Truck ID is the same. A left join is preferred, with records from Table 2 thrown out if there is no match.

You can use merge on the Truck column:

df = pd.merge(df1, df2, on='Truck', how='left')
print (df)
   Truck LoadLocation   Tonnes     ArriveTimestamp           Timestamp  \
0   CT70      338-001  261.000 2016-02-21 00:23:00                 NaT   
1   CT66      338-001  271.000 2016-02-21 00:31:00                 NaT   
2   CT62      338-001  264.000 2016-02-21 00:45:00                 NaT   
3   CT73      338-001  254.000 2016-02-21 00:54:00 2016-08-27 01:16:00   
4   CT71      338-001  250.000 2016-02-21 01:04:00                 NaT   
5   CT39      338-001  182.172 2016-02-21 01:11:00                 NaT   
6   CT62      338-001  285.000 2016-02-21 01:19:00                 NaT   
7   CT70      338-001  282.000 2016-02-21 01:25:00                 NaT   
8   CT73      338-001  250.000 2016-02-21 01:30:00 2016-08-27 01:16:00   
9   CT73      338-001  275.000 2016-02-21 01:35:00 2016-08-27 01:16:00   
10  CT64      338-001  253.000 2016-02-21 01:42:00                 NaT   

          MineX       MineY  
0           NaN         NaN  
1           NaN         NaN  
2           NaN         NaN  
3  -11527.47602 -650210.723  
4           NaN         NaN  
5           NaN         NaN  
6           NaN         NaN  
7           NaN         NaN  
8  -11527.47602 -650210.723  
9  -11527.47602 -650210.723  
10          NaN         NaN  

Then filter with boolean indexing on the absolute difference between the two datetime columns. With the sample data this returns an empty DataFrame, because the two tables cover different dates:

print ((df.Timestamp - df.ArriveTimestamp).dt.total_seconds())
0            NaN
1            NaN
2            NaN
3     16244520.0
4            NaN
5            NaN
6            NaN
7            NaN
8     16242360.0
9     16242060.0
10           NaN
dtype: float64

print ((df.Timestamp - df.ArriveTimestamp).dt.total_seconds().abs() < 60)
0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
dtype: bool

print (df[(df.Timestamp - df.ArriveTimestamp).dt.total_seconds().abs() < 60])
Empty DataFrame
Columns: [Truck, LoadLocation, Tonnes, ArriveTimestamp, Timestamp, MineX, MineY]
Index: []
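
The two steps above (merge on Truck, then filter on the time difference) can be put together in one small helper. The sketch below is only an illustration: the rows are hypothetical, invented so that some pairs actually fall within one minute, and the function name join_within is an assumption, not part of pandas. It uses <= rather than < so that minute-resolution timestamps exactly one minute apart still count as a match.

```python
import pandas as pd

# Hypothetical rows (not the question's data), chosen so some pairs are within one minute.
df1 = pd.DataFrame({
    'Truck': ['CT73', 'CT73', 'CT70'],
    'Tonnes': [254.0, 250.0, 261.0],
    'ArriveTimestamp': pd.to_datetime(
        ['2016-08-27 01:16', '2016-08-27 01:30', '2016-08-27 01:10']),
})
df2 = pd.DataFrame({
    'Truck': ['CT73', 'CT73', 'CT77'],
    'Timestamp': pd.to_datetime(
        ['2016-08-27 01:16', '2016-08-27 01:31', '2016-08-27 01:10']),
    'MineX': [-11527.47602, -11530.0, -11346.36655],
})


def join_within(left, right, seconds=60):
    """Merge on Truck, then keep pairs whose timestamps differ by at most `seconds`."""
    merged = pd.merge(left, right, on='Truck', how='left')
    diff = (merged['Timestamp'] - merged['ArriveTimestamp']).dt.total_seconds().abs()
    # Rows without a Truck match have NaN here; NaN <= seconds is False, so they drop out.
    return merged[diff <= seconds].reset_index(drop=True)


matched = join_within(df1, df2)
print(matched)
```

Note that the filter also discards Table 1 rows that found no partner, so the result behaves like an inner join on the time condition. The intermediate merge pairs every row of a truck with every other row of the same truck, which can get large on long logs.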

import pandas as pd
# Joins only rows whose timestamps are exactly equal; no one-minute tolerance is applied.
df = pd.merge(left=Table1, right=Table2, left_on=['TimeStamp1'], right_on=['TimeStamp2'])
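
A plain merge on the timestamp columns only pairs rows whose times are identical. For a tolerance-based join, pandas also offers merge_asof, which neither answer above uses; it is named here as a possible alternative. The sketch below assumes the column names from the question and uses hypothetical rows. merge_asof needs both frames sorted by their time key, matches per Truck through by=, and keeps every left row (unmatched ones get NaT/NaN).

```python
import pandas as pd

# Hypothetical rows (not the question's data).
df1 = pd.DataFrame({
    'Truck': ['CT73', 'CT73', 'CT70'],
    'Tonnes': [254.0, 250.0, 261.0],
    'ArriveTimestamp': pd.to_datetime(
        ['2016-08-27 01:16:00', '2016-08-27 01:30:00', '2016-08-27 01:10:00']),
})
df2 = pd.DataFrame({
    'Truck': ['CT73', 'CT73', 'CT77'],
    'Timestamp': pd.to_datetime(
        ['2016-08-27 01:16:20', '2016-08-27 01:45:00', '2016-08-27 01:10:00']),
    'MineX': [-11527.47602, -11530.0, -11346.36655],
})

# merge_asof requires both inputs to be sorted by their time keys.
left = df1.sort_values('ArriveTimestamp')
right = df2.sort_values('Timestamp')

result = pd.merge_asof(
    left, right,
    left_on='ArriveTimestamp', right_on='Timestamp',
    by='Truck',                          # only match rows of the same truck
    tolerance=pd.Timedelta(minutes=1),   # reject matches further apart than this
    direction='nearest',                 # look both before and after the arrival time
)
print(result)
```

Unlike the merge-then-filter approach, this picks at most one Table 2 row per Table 1 row (the nearest in time), so it avoids building every same-truck pair first.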
