Merge two pandas DataFrames with conditions
I have two DataFrames that need to be merged under a complex condition. Here are the two DataFrames:
dock_id dock_name avail_bikes avail_docks \
0 3082 Hope St & Union Ave 8 16
1 468 Broadway & W 55 St 0 59
2 407 Henry St & Poplar St 22 15
3 3016 Kent Ave & N 7 St 29 16
status_key datehour ... visi vism wdird wdire \
0 1 2016-06-01 19:25:00 ... NaN NaN NaN NaN
1 1 2016-06-01 19:25:00 ... NaN NaN NaN NaN
2 1 2016-06-01 19:25:00 ... NaN NaN NaN NaN
3 1 2016-06-01 19:25:00 ... NaN NaN NaN NaN
tot_docks _lat _long in_service
0 25 40.711674 -73.951413 1
1 59 40.765265 -73.981923 1
2 37 40.700469 -73.991454 1
3 47 40.720368 -73.961651 1
....
Start Date/Time End Date/Time Event Agency \
0 01/01/2016 12:00:00 AM 01/01/2016 02:00:00 AM Parks Department
1 01/02/2016 12:00:00 AM 01/02/2016 02:00:00 AM Parks Department
2 01/03/2016 12:00:00 AM 01/03/2016 02:00:00 AM Parks Department
3 01/04/2016 12:00:00 AM 01/04/2016 02:00:00 AM Parks Department
latitude longitude
0 40.782865 -73.965355
1 40.782865 -73.965355
2 40.782865 -73.965355
3 40.782865 -73.965355
4 40.782865 -73.965355
I want to join them on the condition:
Start Date/Time <= datehour <= End Date/Time and distance(_lat, _long, latitude, longitude) < d
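I know I could merge the data and then apply a filter to it, but the datasets are too large (10,263,241 rows and 401,080 rows), so I don't think that approach will finish in a reasonable time.
Do you know how to solve this problem?
Thanks for your answers!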
import pandas as pd ... new_frame = pd.merge(dataframe1, dataframe2, on condition)
For a more advanced merge, we can also specify the column names, using dataframe[['column1', 'column2', ...]]
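Note that pd.merge only joins on equal key values, so it cannot express the range and distance condition from the question directly. Below is a rough sketch of one way to work around that, not a definitive solution. It assumes that the number of distinct docks and distinct event locations is small compared with the row counts, that d is measured in kilometres, and that distance means the haversine great-circle distance. The names haversine_km, conditional_join, place_id and max_km are made up for this example, and the tiny frames at the bottom are invented test data that reuse only the question's column names.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs are in degrees and may broadcast."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def conditional_join(status, events, max_km):
    # Give every distinct event location an integer id (hypothetical column).
    events = events.assign(
        place_id=events.groupby(["latitude", "longitude"], sort=False).ngroup()
    )
    places = events.drop_duplicates("place_id")[["place_id", "latitude", "longitude"]]
    # One row per dock; assumes a dock's coordinates never change.
    docks = status[["dock_id", "_lat", "_long"]].drop_duplicates("dock_id")

    # Distance matrix over distinct docks x distinct places only.
    dist = haversine_km(
        docks["_lat"].to_numpy()[:, None], docks["_long"].to_numpy()[:, None],
        places["latitude"].to_numpy()[None, :], places["longitude"].to_numpy()[None, :],
    )
    i, j = np.nonzero(dist < max_km)
    near = pd.DataFrame({
        "dock_id": docks["dock_id"].to_numpy()[i],
        "place_id": places["place_id"].to_numpy()[j],
    })

    # Plain equality merges on the precomputed pairs, then the time filter.
    out = status.merge(near, on="dock_id").merge(events, on="place_id")
    in_window = ((out["Start Date/Time"] <= out["datehour"])
                 & (out["datehour"] <= out["End Date/Time"]))
    return out[in_window].drop(columns="place_id").reset_index(drop=True)

# Invented sample data, for illustration only.
status = pd.DataFrame({
    "dock_id": [3082, 468, 3082],
    "datehour": pd.to_datetime(["2016-06-01 19:25:00", "2016-06-01 19:25:00",
                                "2016-06-02 19:25:00"]),
    "_lat": [40.711674, 40.765265, 40.711674],
    "_long": [-73.951413, -73.981923, -73.951413],
})
events = pd.DataFrame({
    "Start Date/Time": ["06/01/2016 07:00:00 PM", "06/01/2016 07:00:00 PM"],
    "End Date/Time": ["06/01/2016 09:00:00 PM", "06/01/2016 09:00:00 PM"],
    "latitude": [40.782865, 40.711000],
    "longitude": [-73.965355, -73.951000],
})
for col in ["Start Date/Time", "End Date/Time"]:
    events[col] = pd.to_datetime(events[col], format="%m/%d/%Y %I:%M:%S %p")

result = conditional_join(status, events, max_km=3.0)
print(result[["dock_id", "datehour", "latitude", "longitude"]])
```

The distance test runs once per distinct dock and location pair instead of once per row pair, and the row-level work is left to ordinary key merges. The intermediate merge can still get large when many events share a location, so on the full data it may be necessary to call conditional_join on slices of the status frame and concatenate the results.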