Replace values in a column based on another dataframe
I have a table:
Name Profession Character
Ben cinematographer Nan
Scarlett actress Black Widow
Robert actor Iron Man
Chris actor Thor
Kevin producer Nan
I created a new data frame containing a column of the unique Profession values from the table above, sorted in ascending order, plus an incremental ID column:
ID Job
1 actor
2 actress
3 cinematographer
4 producer
Now I need to replace the values in the Profession column of the original table with their corresponding ID from the new table. Desired output:
Name Profession Character
Ben 3 Nan
Scarlett 2 Black Widow
Robert 1 Iron Man
Chris 1 Thor
Kevin 4 Nan
Code so far:
import pandas as pd

df = pd.read_csv(filename)
column = df['Profession'].unique()
new_df = pd.DataFrame(column, columns=['Job'])
new_df = new_df.sort_values(['Job'])
new_df = new_df.reset_index()
new_df = new_df.rename(columns={'index': 'ID'})
new_df['ID'] = new_df.index + 1
df.loc[df['Profession'] == new_df['Job'], 'Profession'] = new_df['ID']
The last line yields 'ValueError: Can only compare identically-labeled Series objects'
Try with replace:
df1.Profession = df1.Profession.replace(df2.set_index('Job').ID)
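For reference, here is a minimal, self-contained sketch of that approach on the sample data from the question. It assumes df1 is the original table and df2 is the ID/Job lookup table. The names job_to_id, result_replace, and result_map are only illustrative, and the Series.map variant is an alternative I am adding, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question (missing characters shown there as "Nan").
df1 = pd.DataFrame({
    "Name": ["Ben", "Scarlett", "Robert", "Chris", "Kevin"],
    "Profession": ["cinematographer", "actress", "actor", "actor", "producer"],
    "Character": [None, "Black Widow", "Iron Man", "Thor", None],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Job": ["actor", "actress", "cinematographer", "producer"],
})

# A Series indexed by Job with ID as its values works as a Job -> ID lookup.
job_to_id = df2.set_index("Job")["ID"]

# Approach from the answer: replace() swaps each value found in the lookup's index
# for the matching ID and leaves values that are not in the lookup unchanged.
result_replace = df1.copy()
result_replace["Profession"] = result_replace["Profession"].replace(job_to_id)

# Alternative (assumption, not from the answer): map() does the same lookup,
# but values that are not in the lookup become NaN instead of being kept.
result_map = df1.copy()
result_map["Profession"] = result_map["Profession"].map(job_to_id)

print(result_replace)
```

The original line fails because df['Profession'] and new_df['Job'] have different lengths and index labels, so pandas cannot compare them element by element. Looking values up through a Series indexed by Job avoids that comparison. Depending on the pandas version, replace may emit a warning about dtype downcasting when strings are replaced by integers; map does not have that issue.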