I have a dataframe with a column "PURCHASE". After cleaning up that column, I get values like these:
>> for x in range(0, 19):
>>     print(df['PURCHASE'][x].split(start)[1].split(end)[0])
WEBSITE1 *XBEMHM6U52
WEBSITE2.COM/BILL
WEBSITE1 *S3BKFMWFB2
XYZ*WEBSITE3
WEBSITE4
I have a separate dataframe like this:
>> icons_df = pd.DataFrame(columns = ['WEBSITE_NAME','ICON'])
>> icons_df
  WEBSITE_NAME               ICON
0     WEBSITE1  ..icons\site1.png
1     WEBSITE2  ..icons\site2.jpg
2     WEBSITE3  ..icons\site3.png
I want to add an ICON column to df whose value depends on the website name found in PURCHASE. How can I match the website name in PURCHASE against icons_df and assign the corresponding icon?
First build a regex pattern from your website names, then use Series.str.findall to find the matching website in each row, and do a left merge afterwards:
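import pandas as pd

df = pd.DataFrame({"name": ["WEBSITE1 *XBEMHM6U52", "WEBSITE2.COM/BILL", "WEBSITE1 *S3BKFMWFB2", "XYZ*WEBSITE3", "WEBSITE4"]})
icons_df = pd.DataFrame({"WEBSITE_NAME": ["WEBSITE1", "WEBSITE2", "WEBSITE3"],
                         "ICON": ["1.png", "2.jpg", "3.png"]})
# Joining the names with "|" gives the pattern "WEBSITE1|WEBSITE2|WEBSITE3"
web_names = icons_df["WEBSITE_NAME"].tolist()
# Keep the first known website name found in each row (NaN when nothing matches)
df["WEBSITE_NAME"] = df["name"].str.findall("|".join(web_names)).str[0]
# A left merge keeps every row of df; rows without a match get NaN in ICON
print(df.merge(icons_df, on="WEBSITE_NAME", how="left").drop("WEBSITE_NAME", axis=1))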
#
                   name   ICON
0  WEBSITE1 *XBEMHM6U52  1.png
1     WEBSITE2.COM/BILL  2.jpg
2  WEBSITE1 *S3BKFMWFB2  1.png
3          XYZ*WEBSITE3  3.png
4              WEBSITE4    NaN
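If the real website names can contain regex metacharacters (a "." in "SITE.COM", for example), or one name is a prefix of another, the plain "|".join pattern may match more or less than intended. A possible variant, sketched under a few assumptions: the text column is called PURCHASE as in the question, every WEBSITE_NAME in icons_df is unique, and the names are escaped with re.escape and tried longest first. It uses Series.str.extract plus Series.map instead of findall plus merge, so no helper column has to be dropped afterwards.

```python
import re

import pandas as pd

# Sample data copied from the answer; the column name PURCHASE is an assumption
df = pd.DataFrame({"PURCHASE": ["WEBSITE1 *XBEMHM6U52", "WEBSITE2.COM/BILL",
                                "WEBSITE1 *S3BKFMWFB2", "XYZ*WEBSITE3", "WEBSITE4"]})
icons_df = pd.DataFrame({"WEBSITE_NAME": ["WEBSITE1", "WEBSITE2", "WEBSITE3"],
                         "ICON": ["1.png", "2.jpg", "3.png"]})

# Longest names first, so a longer name wins over a shorter one that is its prefix
names = sorted(icons_df["WEBSITE_NAME"], key=len, reverse=True)

# re.escape makes every name match literally; the outer group is what extract returns
pattern = "(" + "|".join(re.escape(name) for name in names) + ")"

# First matching website name per row, NaN when no known name occurs
matched = df["PURCHASE"].str.extract(pattern, expand=False)

# Look up the icon for each matched name (requires unique WEBSITE_NAME values)
icon_by_name = icons_df.set_index("WEBSITE_NAME")["ICON"]
df["ICON"] = matched.map(icon_by_name)

print(df)
```

Rows with no known website name, such as WEBSITE4 here, end up with NaN in ICON, the same as with the left merge.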