How to transform a column of a DataFrame using the values of another DataFrame's columns
How can I add a column "main_category" to orig_df that indicates which main category each sub-category belongs to? For instance, a row of orig_df with the value "Movie" must have "main_category" set to "Entertainment", and a row with "Maths" must have it set to "Education".
import pandas as pd
import numpy as np
orig_df = pd.DataFrame({"sub_cat" : ["Movie", "Science", "Maths", "Music", "Songs", "Dance", "English", "Maths", "Songs"], "Student": ["Stud1", "Stud2", "Stud3", "Stud4", "Stud5", "Stud6", "Stud7", "Sud8", "Stud9"]})
sub_df = pd.DataFrame({"Education": [0,1,1,0,0,0,1], "Entertainment": [1,0,0,1,1,1,0]}, index=["Movie", "Science", "Maths", "Music", "Songs", "Dance", "English"])
print(orig_df)
print(sub_df)
One way is to create a dictionary from sub_df by iterating over its rows, then pass that dictionary to map on orig_df['sub_cat']:
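# For each sub-category (row label), pick the first column whose flag is 1.
d = {idx: next(k for k in sub_df if row[k] == 1)
     for idx, row in sub_df.iterrows()}
# Look up each sub_cat in the dictionary to get its main category.
orig_df['main_category'] = orig_df['sub_cat'].map(d)
print(orig_df)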
  Student  sub_cat  main_category
0   Stud1    Movie  Entertainment
1   Stud2  Science      Education
2   Stud3    Maths      Education
3   Stud4    Music  Entertainment
4   Stud5    Songs  Entertainment
5   Stud6    Dance  Entertainment
6   Stud7  English      Education
7    Sud8    Maths      Education
8   Stud9    Songs  Entertainment
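As a possible alternative to building the dictionary row by row, the same lookup can be sketched with idxmax, which returns the label of the column holding the largest value in each row. This is only a sketch, not part of the original answer: it assumes the flags are 0/1 and that exactly one flag per row is 1, and the name main_cat is introduced here purely for illustration.

```python
import pandas as pd

orig_df = pd.DataFrame({
    "sub_cat": ["Movie", "Science", "Maths", "Music", "Songs",
                "Dance", "English", "Maths", "Songs"],
    "Student": ["Stud1", "Stud2", "Stud3", "Stud4", "Stud5",
                "Stud6", "Stud7", "Sud8", "Stud9"],
})
sub_df = pd.DataFrame(
    {"Education": [0, 1, 1, 0, 0, 0, 1],
     "Entertainment": [1, 0, 0, 1, 1, 1, 0]},
    index=["Movie", "Science", "Maths", "Music", "Songs", "Dance", "English"],
)

# main_cat is a hypothetical helper name: a Series indexed by sub-category
# whose values are the column label with the largest flag in that row.
main_cat = sub_df.idxmax(axis=1)

# Series.map accepts a Series and looks each sub_cat up in its index.
orig_df["main_category"] = orig_df["sub_cat"].map(main_cat)
print(orig_df)
```

Any sub_cat that is missing from the index of sub_df would come out as NaN here, which is the same behaviour as mapping with a dictionary.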
Note that this assumes each sub_cat maps to exactly one of "Education" or "Entertainment".
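If that assumption is in doubt, it may be worth checking before building the mapping. With the dictionary approach, a row with no flag set to 1 makes next(...) fail, and a row with several flags set silently takes the first matching column. The sketch below counts the flags per row. The names flags_per_row and ambiguous are hypothetical, introduced only for this check.

```python
import pandas as pd

sub_df = pd.DataFrame(
    {"Education": [0, 1, 1, 0, 0, 0, 1],
     "Entertainment": [1, 0, 0, 1, 1, 1, 0]},
    index=["Movie", "Science", "Maths", "Music", "Songs", "Dance", "English"],
)

# Count how many main categories are flagged with 1 for each sub-category.
flags_per_row = sub_df.eq(1).sum(axis=1)

# Keep only the sub-categories that have zero or more than one flag set.
ambiguous = flags_per_row[flags_per_row != 1]

if not ambiguous.empty:
    raise ValueError(
        f"sub-categories without exactly one main category: {list(ambiguous.index)}"
    )
```

With the sub_df from the question, every row has exactly one flag set, so the check passes without raising.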