Python Pandas Lookup another Dataframe return Multiple Matches
I have a dataframe (customers) of customers, each with a unique id.
I need to look up each customer id in another dataframe (meetings) of meetings that have been held and return the date of the most recent meeting.
Most customers will have had multiple meetings, but some will have had none. In that case, I need to return 0.
Customers
id name
1607 duck
1622 dog
1972 cat
2204 bird
2367 fish
2373 elephant
2386 moose
2413 mammal
2418 man
22120 goldfish
6067 toucan
83340 capybara
Meetings as below:
meetings
customer_id date meeting_id
1607 25/02/2019 1235
1607 11/03/2019 2315
1607 11/03/2019 5483
1622 16/11/2018 32125
1972 13/02/2019 6548
2204 4/02/2019 6542
2204 8/11/2018 8755
2367 22/01/2019 6545
2373 14/12/2018 8766
2373 18/01/2019 5448
2386 18/02/2019 32125
2386 18/02/2019 5458
2413 6/12/2018 31125
2413 5/03/2019 5183
2418 21/01/2019 3158
2418 23/01/2019 3127
2418 24/01/2019 7878
2418 21/01/2019 7894
2418 31/01/2019 7895
2418 6/03/2019 4548
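For anyone who wants to experiment, here is a minimal sketch of how the two tables might be built. Only a few rows are copied over, and the helper column `parsed` is an assumption added for illustration. It also shows why the day-first date strings should be parsed before looking for the latest one: compared as text, they do not sort chronologically.

```python
import pandas as pd

# A few rows copied from the tables above; the remaining rows are omitted for brevity.
customers = pd.DataFrame({
    "id": [1607, 2204, 2418, 22120],
    "name": ["duck", "bird", "man", "goldfish"],
})
meetings = pd.DataFrame({
    "customer_id": [1607, 1607, 2204, 2204, 2418, 2418],
    "date": ["25/02/2019", "11/03/2019", "4/02/2019", "8/11/2018", "31/01/2019", "6/03/2019"],
    "meeting_id": [1235, 2315, 6542, 8755, 7895, 4548],
})

# "parsed" is a hypothetical helper column holding real timestamps (day/month/year).
meetings["parsed"] = pd.to_datetime(meetings["date"], format="%d/%m/%Y")

# Compare the "latest" meeting of customer 2204 as text and as a timestamp.
mask = meetings["customer_id"] == 2204
latest_as_text = max(meetings.loc[mask, "date"].tolist())
latest_as_date = meetings.loc[mask, "parsed"].max()

print(latest_as_text)  # 8/11/2018 (wrong: '8' sorts after '4' as text)
print(latest_as_date)  # 2019-02-04 00:00:00
```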
I want to return the customers table with two extra columns showing the date of the most recent meeting and its meeting_id, as below:
id name most_recent most_recent_id
1607 duck 11/03/2019 xxxx
1622 dog 16/11/2018 xxxxx
1972 cat 13/02/2019 xxxx
2204 bird 4/02/2019 etc
2367 fish 22/01/2019
2373 elephant 18/01/2019
2386 moose 18/02/2019
2413 mammal 5/03/2019
2418 man 6/03/2019
22120 goldfish 0
6067 toucan 0
83340 capybara 0
I have tried a couple of different ways, such as looping through the dataframes, but haven't got anything that works. Any help appreciated! Thanks.
Try this:
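import pandas as pd
# Sort chronologically (the dates are day-first strings), then keep each customer's last, i.e. latest, meeting.
df2 = df2.sort_values('date', key=lambda s: pd.to_datetime(s, format='%d/%m/%Y'), kind='stable').drop_duplicates(subset=['customer_id'], keep='last')
pd.merge(df1, df2, left_on='id', right_on='customer_id', how='left').rename(columns={'date': 'most_recent', 'meeting_id': 'most_recent_id'}).drop(columns='customer_id').fillna(0)
You just need to sort df2 by date, remove the duplicate records, and keep the last (latest) record for each customer. Then apply a left merge. Without the sort, keep='last' only keeps whichever row happens to come last in df2, which is not always the most recent meeting (customer 2204, for example, has its older meeting listed second).
Output:
id name most_recent most_recent_id
0 1607 duck 11/03/2019 5483.0
1 1622 dog 16/11/2018 32125.0
2 1972 cat 13/02/2019 6548.0
3 2204 bird 4/02/2019 6542.0
4 2367 fish 22/01/2019 6545.0
5 2373 elephant 18/01/2019 5448.0
6 2386 moose 18/02/2019 5458.0
7 2413 mammal 5/03/2019 5183.0
8 2418 man 6/03/2019 4548.0
9 22120 goldfish 0 0.0
10 6067 toucan 0 0.0
11 83340 capybara 0 0.0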
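An alternative sketch, not part of the answer above: instead of dropping duplicates, the latest meeting per customer can be picked with `groupby` and `idxmax` on parsed dates, which does not depend on the row order of df2. The sample frames below are a small subset of the question's data, and the names `parsed`, `latest_rows` and `result` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample: a subset of the rows from the question.
df1 = pd.DataFrame({"id": [1607, 2204, 22120], "name": ["duck", "bird", "goldfish"]})
df2 = pd.DataFrame({
    "customer_id": [1607, 1607, 2204, 2204],
    "date": ["25/02/2019", "11/03/2019", "4/02/2019", "8/11/2018"],
    "meeting_id": [1235, 2315, 6542, 8755],
})

# Parse into a separate Series so the original date strings are kept for display.
parsed = pd.to_datetime(df2["date"], format="%d/%m/%Y")

# idxmax gives, per customer, the index label of the row with the latest date.
latest_rows = df2.loc[parsed.groupby(df2["customer_id"]).idxmax()]

# Left merge keeps customers without meetings; their new columns start out as NaN.
result = (
    df1.merge(latest_rows, left_on="id", right_on="customer_id", how="left")
       .drop(columns="customer_id")
       .rename(columns={"date": "most_recent", "meeting_id": "most_recent_id"})
)

# Fill the gaps with 0 column by column, so the ids can go back to integers.
result["most_recent_id"] = result["most_recent_id"].fillna(0).astype(int)
result["most_recent"] = result["most_recent"].astype(object).fillna(0)

print(result)
```

If a customer has several meetings on the same latest date, only one of them is returned, so it is worth checking which one you get if that matters for your data.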