companies.xlsx

   company    To
1  amazon     hi@test.de
2  google     bye@test.com
3  amazon     hi@tld.com
4  starbucks  hi@test.de
5  greyhound  bye@tuz.de

emails.xlsx
hi@test.de bye@test.com hi@tld.com ...
1 amazon google microsoft
2 starbucks amazon tesla
3 Grey Hound greyhound
4 ferrari
I have the two Excel sheets above and read both of them:
import pandas as pd

file1 = pd.ExcelFile('data/companies.xlsx')
file2 = pd.ExcelFile('data/emails.xlsx')
df_companies = file1.parse('sheet1')
df_emails = file2.parse('sheet1')
What I'm trying to accomplish is to flag each row of companies.xlsx with a 1 or a 0.
For example, the company amazon has the To email hi@test.de in companies.xlsx. In emails.xlsx the header hi@test.de exists and amazon is found in that column, so the result is a '1'.
Does anyone know how to accomplish this?
Here's one approach. Convert df_emails to a dictionary and map it to df_companies. Then, compare the mapped column with df_companies['company'].
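# Look up each To address in {header: [column values]}; unknown addresses become ''
df_companies['check'] = df_companies['To'].map(df_emails.to_dict(orient='list')).fillna('')
# 1 if the row's company appears in the looked-up column, else 0
df_companies['check'] = df_companies.apply(lambda x: x['company'] in x['check'], axis=1).astype(int)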
company To check
1 amazon hi@test.de 1
2 google bye@test.com 1
3 amazon hi@tld.com 0
4 starbucks hi@test.de 1
5 greyhound bye@tuz.de 0
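
For anyone who wants to try this without the Excel files, here is a minimal, self-contained sketch of the same idea. It is a variant, not the exact code above: it builds the lookup with a dict of sets and uses dict.get instead of Series.map. The sample frames are retyped from the question, and the placement of cells in df_emails (which company sits under which header, and where the empty cells are) is an assumption, since the original layout is not fully recoverable. The names lookup and is_listed are made up for illustration.

```python
import pandas as pd

# Sample data retyped from the question. The exact cell placement in
# df_emails is an assumption; None marks cells assumed to be empty.
df_companies = pd.DataFrame({
    "company": ["amazon", "google", "amazon", "starbucks", "greyhound"],
    "To": ["hi@test.de", "bye@test.com", "hi@tld.com", "hi@test.de", "bye@tuz.de"],
})
df_emails = pd.DataFrame({
    "hi@test.de": ["amazon", "starbucks", "Grey Hound", "ferrari"],
    "bye@test.com": ["google", "amazon", "greyhound", None],
    "hi@tld.com": ["microsoft", "tesla", None, None],
})

# Map each email header to the set of company names in that column,
# dropping empty cells so they never count as a match.
lookup = {col: set(df_emails[col].dropna()) for col in df_emails.columns}


def is_listed(row):
    """Return 1 if the row's company appears under its To address, else 0."""
    # Addresses with no matching header fall back to an empty set.
    return int(row["company"] in lookup.get(row["To"], set()))


df_companies["check"] = df_companies.apply(is_listed, axis=1)
print(df_companies)
```

The comparison is an exact string match, so 'Grey Hound' does not match 'greyhound'. An address that has no header in df_emails (bye@tuz.de in this sample) yields 0.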