I have two data frames. The first is a list of email addresses
email_ad Band 30 Band 40
example_email@gmail.com
sample_email@gmail.com
The second is the following data frame
email_ad Name Manager_Name Manager_Band_level
example_email@gmail.com. Tom Banks Boss1 30
sample_email@gmail.com. Bill Bob Boss2 40
I want to map each email in the first dataframe to the second, if the email is a match, then check the Manager Band level. Based off the number, fill that corresponding manager's email into the "Band 30" or "Band 40" categories.
So the desired dataframe is the following:
email_ad Band 30 Band 40
example_email@gmail.com Boss1
sample_email@gmail.com. Boss2
if someone could please help me for syntax that would be amazing. I was additionally stuck making a dictionary out of only certain columns in the dataframe. Thanks so much!!
You can do this with a pivot.
pivoted = df.pivot('email_ad', 'Manager_Band_level', 'Manager_Name')
pivoted
# Manager_Band_level 30 40
# email_ad
# example_email@gmail.com Boss1 NaN
# sample_email@gmail.com NaN Boss2
That format is probably usable as is. But if you want to clean it up a bit more, you can fill in the missing values and flatten it back out.
cleaned = pivoted.fillna('').reset_index()
cleaned
# Manager_Band_level email_ad 30 40
# 0 example_email@gmail.com Boss1
# 1 sample_email@gmail.com Boss2
You can get rid of the awkward, and now incorrect, column index name by resetting the columns
.
cleaned.columns = list(cleaned.columns)
cleaned
# email_ad 30 40
# 0 example_email@gmail.com Boss1
# 1 sample_email@gmail.com Boss2
If the Band
prefix is important, you can add that at the same time.
cleaned.columns = [col if col == 'email_ad' else f'Band {col}' for col in cleaned.columns]
cleaned
# email_ad Band 30 Band 40
# 0 example_email@gmail.com Boss1
# 1 sample_email@gmail.com Boss2
Try using merge and maybe a loop if there are multiple bands:
import pandas as pd
from io import StringIO
s1 = '''
email_ad,Band 30,Band 40
example_email@gmail.com,,
sample_email@gmail.com,,
'''
s2= '''
email_ad,Name,Manager_Name,Manager_Band_level
example_email@gmail.com,Tom Banks,Boss1,30
sample_email@gmail.com,Bill Bob,Boss2,40
'''
df1 = pd.read_csv(StringIO(s1))
df2 = pd.read_csv(StringIO(s2))
And then do the merge
df3 = df1.merge(df2, how='left', left_on='email_ad', right_on='email_ad')
for i in (30,40):
df3.loc[df3.Manager_Band_level == i, f'Band {i}'] = \
df3.loc[df3.Manager_Band_level == i, 'Manager_Name']
df3[['email_ad', 'Band 30', 'Band 40']]
# email_ad Band 30 Band 40
# 0 example_email@gmail.com Boss1 NaN
# 1 sample_email@gmail.com NaN Boss2
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.