
Setting value for dataframe column from another dataframe based on condition

I have one dataframe

#Around 100000 rows
df = pd.DataFrame({'text':    [ 'Apple is healthy',  'Potato is round', 'Apple might be green'],
                   'category': ["","", ""],
                   })

A second dataframe

#Around 3000 rows
df_2 = pd.DataFrame({'keyword':    [ 'Apple ',  'Potato'],
                   'category': ["fruit","vegetable"],
                   })

The required result

#Around 100000 rows
df = pd.DataFrame({'text':    [ 'Apple is healthy',  'Potato is round', 'Apple might be green'],
                   'category': ["fruit","vegetable", "fruit"],
                   })

This is what I have tried so far

df.set_index('text')
df_2.set_index('keyword')
df.update(df_2)

The result is

text    category
Apple is healthy    fruit
Potato is round vegetable
Apple might be green

As you can see, it does not add a category for the last row. How can I achieve that?

You need to assign the output of DataFrame.set_index back, because unlike DataFrame.update it is not an in-place operation. For matching, use Series.str.extract with a pattern built from the df_2["keyword"] column:

df = df.set_index(df['text'].str.extract(f'({"|".join(df_2["keyword"])})', expand=False))
df_2 = df_2.set_index('keyword')
print(df)
                        text category
text                                 
Apple       Apple is healthy         
Potato       Potato is round         
Apple   Apple might be green  



df.update(df_2)
print(df)
                        text   category
text                                   
Apple       Apple is healthy      fruit
Potato       Potato is round  vegetable
Apple   Apple might be green      fruit
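
To make the in-place difference concrete, here is a minimal sketch on a cut-down version of the question's data. The names index_unchanged, indexed, other and result are only illustrative and are not from the original post.

```python
import pandas as pd

df = pd.DataFrame({'text': ['Apple is healthy', 'Potato is round'],
                   'category': ['', '']})

# set_index returns a new DataFrame; without assignment, df keeps its old index.
df.set_index('text')
index_unchanged = list(df.index) == [0, 1]

# Assigning the result keeps the new index.
indexed = df.set_index('text')

# update modifies the caller in place and returns None.
# Rows are aligned on the index, so only the matching label changes.
other = pd.DataFrame({'category': ['fruit']}, index=['Apple is healthy'])
result = indexed.update(other)

print(index_unchanged)
print(result)
print(indexed)
```

This is also why the attempt in the question filled only the first two rows: both frames still had the default integer index, so update aligned rows 0 and 1 of df_2 with rows 0 and 1 of df.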

If you only need to add one column, use Series.str.extract with Series.map:

s = df['text'].str.extract(f'({"|".join(df_2["keyword"])})', expand=False)
df['category'] = s.map(df_2.set_index(['keyword'])['category'])
print(df)
                   text   category
0      Apple is healthy      fruit
1       Potato is round  vegetable
2  Apple might be green      fruit
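
With around 3000 real keywords, some may contain regex metacharacters or stray whitespace, and some texts may match no keyword at all. The following is a hedged sketch of a more defensive variant of the map approach. It assumes that keywords are meant as literal strings, that they are unique after stripping whitespace, and that unmatched rows should get an empty category. The extra 'C++' and 'Stone' rows are made up for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({'text': ['Apple is healthy', 'Potato is round',
                            'Apple might be green', 'C++ is a language',
                            'Stone is hard']})
df_2 = pd.DataFrame({'keyword': ['Apple ', 'Potato', 'C++'],
                     'category': ['fruit', 'vegetable', 'language']})

# Assumption: keywords are literal text, so strip whitespace before building the lookup.
lookup = (df_2.assign(keyword=df_2['keyword'].str.strip())
              .set_index('keyword')['category'])

# Escape metacharacters such as '+', and put longer keywords first so that
# a short keyword cannot shadow a longer one sharing the same prefix.
keywords = sorted(lookup.index, key=len, reverse=True)
pattern = '(' + '|'.join(re.escape(k) for k in keywords) + ')'

# Extract the first matching keyword, map it to its category,
# and leave rows without any match as an empty string.
matched = df['text'].str.extract(pattern, expand=False)
df['category'] = matched.map(lookup).fillna('')
print(df)
```

Series.map needs a unique index on the lookup Series, so duplicated keywords in df_2 would have to be dropped or resolved first.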
