Replace column values in one pandas DataFrame with a column from another pandas DataFrame, with conditions
I have the following three pandas DataFrames. I want to replace the values in the company and division columns with the IDs from the respective company and division DataFrames.
pd_staff:
id name company division
P001 John Sunrise Headquarter
P002 Jane Falcon Digital Research & Development
P003 Joe Ashford Finance
P004 Adam Falcon Digital Sales
P004 Barbara Sunrise Human Resource
pd_company:
id name
1 Sunrise
2 Falcon Digital
3 Ashford
pd_division:
id name
1 Headquarter
2 Research & Development
3 Finance
4 Sales
5 Human Resource
This is the end result that I am trying to produce:
id name company division
P001 John 1 1
P002 Jane 2 2
P003 Joe 3 3
P004 Adam 2 4
P004 Barbara 1 5
I have tried to combine staff and company using this code:
pd_staff.loc[pd_staff['company'].isin(pd_company['name']), 'company'] = pd_company.loc[pd_company['name'].isin(pd_staff['company']), 'id']
which produces:
id name company
P001 John 1.0
P002 Jane NaN
P003 Joe NaN
P004 Adam NaN
P004 Barbara NaN
You can do:
pd_staff['company'] = pd_staff['company'].map(pd_company.set_index('name')['id'])
pd_staff['division'] = pd_staff['division'].map(pd_division.set_index('name')['id'])
print(pd_staff):
id name company division
0 P001 John 1 1
1 P002 Jane 2 2
2 P003 Joe 3 3
3 P004 Adam 2 4
4 P004 Barbara 1 5
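For anyone who wants to run the map approach end to end, here is a minimal self-contained sketch. The frames are rebuilt from the tables in the question; the names company_ids, division_ids, result and unmatched are only illustrative and do not come from the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question's tables.
pd_staff = pd.DataFrame({
    "id": ["P001", "P002", "P003", "P004", "P004"],
    "name": ["John", "Jane", "Joe", "Adam", "Barbara"],
    "company": ["Sunrise", "Falcon Digital", "Ashford", "Falcon Digital", "Sunrise"],
    "division": ["Headquarter", "Research & Development", "Finance", "Sales", "Human Resource"],
})
pd_company = pd.DataFrame({"id": [1, 2, 3],
                           "name": ["Sunrise", "Falcon Digital", "Ashford"]})
pd_division = pd.DataFrame({"id": [1, 2, 3, 4, 5],
                            "name": ["Headquarter", "Research & Development",
                                     "Finance", "Sales", "Human Resource"]})

# Build name -> id lookup Series (index = name, values = id).
company_ids = pd_company.set_index("name")["id"]
division_ids = pd_division.set_index("name")["id"]

# Work on a copy so the original staff frame stays untouched.
result = pd_staff.copy()
result["company"] = result["company"].map(company_ids)
result["division"] = result["division"].map(division_ids)

# Names missing from a lookup table become NaN; collect such rows for inspection.
unmatched = result[result[["company", "division"]].isna().any(axis=1)]

print(result)
print("rows without a match:", len(unmatched))
```

Note that map returns NaN for any name that is not present in the lookup table, so checking unmatched is a cheap way to catch typos or stray whitespace in the staff data.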
This will achieve the desired result (assuming df, df2 and df3 are the staff, company and division DataFrames respectively):
df_merge = df.merge(df2, how = 'inner', right_on = 'name', left_on = 'company', suffixes=('', '_y'))
df_merge = df_merge.merge(df3, how = 'inner', left_on = 'division', right_on = 'name', suffixes=('', '_z'))
df_merge = df_merge[['id', 'name', 'id_y', 'id_z']]
df_merge.columns = ['id', 'name', 'company', 'division']
df_merge = df_merge.sort_values('id')  # sort_values returns a new DataFrame, so assign it back
First, let's modify the company and division DataFrames a little:
df2.rename(columns={'name':'company'},inplace=True)
df3.rename(columns={'name':'division'},inplace=True)
Then:
df1=df1.merge(df2,on='company',how='left').merge(df3,on='division',how='left')
df1=df1[['id_x','name','id_y','id']]
df1.rename(columns={'id_x':'id','id_y':'company','id':'division'},inplace=True)
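The merge-based answers above depend on the automatically generated suffixes (id_x, id_y, id_z). A variant that renames the lookup columns before merging avoids that bookkeeping. The following is only a sketch under the assumption that the three frames look like the tables in the question; staff, company, division, company_lookup and division_lookup are illustrative names.

```python
import pandas as pd

# Sample data rebuilt from the question's tables.
staff = pd.DataFrame({
    "id": ["P001", "P002", "P003", "P004", "P004"],
    "name": ["John", "Jane", "Joe", "Adam", "Barbara"],
    "company": ["Sunrise", "Falcon Digital", "Ashford", "Falcon Digital", "Sunrise"],
    "division": ["Headquarter", "Research & Development", "Finance", "Sales", "Human Resource"],
})
company = pd.DataFrame({"id": [1, 2, 3],
                        "name": ["Sunrise", "Falcon Digital", "Ashford"]})
division = pd.DataFrame({"id": [1, 2, 3, 4, 5],
                         "name": ["Headquarter", "Research & Development",
                                  "Finance", "Sales", "Human Resource"]})

# Rename the lookup columns up front so the merges produce no suffixed columns.
company_lookup = company.rename(columns={"id": "company_id", "name": "company"})
division_lookup = division.rename(columns={"id": "division_id", "name": "division"})

# Left merges keep every staff row; validate raises if a lookup name is duplicated.
merged = (
    staff
    .merge(company_lookup, on="company", how="left", validate="many_to_one")
    .merge(division_lookup, on="division", how="left", validate="many_to_one")
)

# Drop the text columns and let the ID columns take over their names.
result = (
    merged
    .drop(columns=["company", "division"])
    .rename(columns={"company_id": "company", "division_id": "division"})
)

print(result)
```

Using how="left" keeps staff rows whose company or division has no match (the ID becomes NaN), whereas how="inner" would silently drop them.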
Use apply; you can write a function that will replace the values. From the second Excel file you pass the value to look up and what to replace it with. Here I am replacing Sunrise with 1 because it is in the second Excel file.
import pandas as pd
df = pd.read_excel('teste.xlsx')
df2 = pd.read_excel('ids.xlsx')
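def altera(df33, field='Sunrise', new_field='1'):  # for demonstration purposes I left default values, but they should be passed in from the second Excel file
    # apply passes one cell at a time, so df33 is a single string here and this is str.replace
    return df33.replace(field, new_field)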
df.loc[:, 'company'] = df['company'].apply(altera)
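The function above hard-codes a single name/ID pair. One way to generalise it, sketched here with in-memory frames instead of the Excel files (whose layout is not shown), is to build a dictionary from the lookup table and let apply consult it. The names staff, company, name_to_id and to_id are hypothetical.

```python
import pandas as pd

# Stand-ins for the two Excel sheets; the real files are not shown in the answer.
staff = pd.DataFrame({
    "name": ["John", "Jane", "Joe", "Adam", "Barbara"],
    "company": ["Sunrise", "Falcon Digital", "Ashford", "Falcon Digital", "Sunrise"],
})
company = pd.DataFrame({"id": [1, 2, 3],
                        "name": ["Sunrise", "Falcon Digital", "Ashford"]})

# Dictionary of company name -> company id, built from the lookup table.
name_to_id = dict(zip(company["name"], company["id"]))

def to_id(value, mapping=name_to_id):
    """Return the ID for a company name, or the value unchanged if it is unknown."""
    return mapping.get(value, value)

# apply calls to_id once per cell of the company column.
staff["company"] = staff["company"].apply(to_id)

print(staff)
```

Unlike str.replace, the dictionary lookup matches whole values only, so a name that merely contains "Sunrise" would be left as it is.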