
How to derive a dataframe column from the column values of another dataframe

How can I add a column "main_category" to orig_df that indicates which main category each sub-category belongs to? For instance, a row of orig_df with the value "Movie" must have "Entertainment" as its "main_category", and "Maths" must have "Education".

import pandas as pd
import numpy as np

orig_df = pd.DataFrame({"sub_cat" : ["Movie", "Science", "Maths", "Music", "Songs", "Dance", "English", "Maths", "Songs"], "Student": ["Stud1", "Stud2", "Stud3", "Stud4", "Stud5", "Stud6", "Stud7", "Sud8", "Stud9"]})
sub_df = pd.DataFrame({"Education": [0,1,1,0,0,0,1], "Entertainment": [1,0,0,1,1,1,0]}, index=["Movie", "Science", "Maths", "Music", "Songs", "Dance", "English"])
print(orig_df)
print(sub_df)

One way is to build a dictionary from sub_df by iterating over its rows, mapping each sub-category (the index) to the name of the column that holds a 1.

Then pass that dictionary to map on orig_df['sub_cat']:

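# For each row of sub_df, pick the first column whose value is 1
d = {idx: next(k for k in sub_df if row[k] == 1)
     for idx, row in sub_df.iterrows()}

# Look up each sub-category in the dictionary
orig_df['main_category'] = orig_df['sub_cat'].map(d)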

print(orig_df)

  Student  sub_cat  main_category
0   Stud1    Movie  Entertainment
1   Stud2  Science      Education
2   Stud3    Maths      Education
3   Stud4    Music  Entertainment
4   Stud5    Songs  Entertainment
5   Stud6    Dance  Entertainment
6   Stud7  English      Education
7    Sud8    Maths      Education
8   Stud9    Songs  Entertainment

Note that this assumes each sub_cat maps to exactly one of "Education" or "Entertainment". If a row of sub_df had a 1 in both columns, only the first matching column would be used.
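As an alternative to iterating rows, the same lookup can probably be built without a Python-level loop. The sketch below is not part of the original answer: it rebuilds the two dataframes from the question, checks the one-category-per-row assumption, and then uses idxmax(axis=1) to pick, for each sub-category, the column holding the largest value (the 1). The names one_hot_ok and mapping are made up for illustration.

```python
import pandas as pd

# Same data as in the question
orig_df = pd.DataFrame({
    "sub_cat": ["Movie", "Science", "Maths", "Music", "Songs",
                "Dance", "English", "Maths", "Songs"],
    "Student": ["Stud1", "Stud2", "Stud3", "Stud4", "Stud5",
                "Stud6", "Stud7", "Sud8", "Stud9"],
})
sub_df = pd.DataFrame(
    {"Education": [0, 1, 1, 0, 0, 0, 1],
     "Entertainment": [1, 0, 0, 1, 1, 1, 0]},
    index=["Movie", "Science", "Maths", "Music", "Songs", "Dance", "English"],
)

# Check the assumption: every sub-category row contains exactly one 1
# (one_hot_ok is a hypothetical name used only in this sketch)
one_hot_ok = bool(sub_df.sum(axis=1).eq(1).all())

# For each row, idxmax(axis=1) returns the label of the column with the
# largest value, which is the column holding the 1 when the check passes
mapping = sub_df.idxmax(axis=1)

# Series.map accepts a Series and looks values up by its index
orig_df["main_category"] = orig_df["sub_cat"].map(mapping)
print(orig_df)
```

If one_hot_ok is False, idxmax still returns a label for every row (the first column among ties), so it is worth checking before relying on the result.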
