I have two DataFrames. The first one is:
DF1
Name Salary IDnum Age City
Mike Thanks 52000 542 52 NYC
Bob Very 15000 451 21 LA
Sam You 72000 556 21 SF
The other DF holds only IDnum values. Its columns are split by city, plus a Bonus column listing the employees who get 3/2x their salary. In this case only Sam You gets the 3/2x bonus, along with the employee with IDnum 134, who is somewhere below my top 3 rows.
DF2
NYC LA SF Bonus
0 542 451 421 556
1 745 345 367 134
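For anyone who wants to reproduce this, here is a minimal sketch of how the two frames could be built. The values are copied from the tables above; treating Name as a single two-word column and using a default 0..n index are assumptions on my part.

```python
import pandas as pd

# DF1: one row per employee (values copied from the table above)
df1 = pd.DataFrame({
    "Name": ["Mike Thanks", "Bob Very", "Sam You"],
    "Salary": [52000, 15000, 72000],
    "IDnum": [542, 451, 556],
    "Age": [52, 21, 21],
    "City": ["NYC", "LA", "SF"],
})

# DF2: IDnum values grouped by city, plus the Bonus column
df2 = pd.DataFrame({
    "NYC": [542, 745],
    "LA": [451, 345],
    "SF": [421, 367],
    "Bonus": [556, 134],
})

print(df1)
print(df2)
```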
My goal is to build a new DF3 that combines the salaries in DF1 with the IDnum values in DF2.
The result should look something like this. I very much want to avoid adding it to the first DF, as that would create issues such as repeats and date conflicts.
Name IDnum Age City Bonus
Mike Thanks 542 52 NYC 52000
Bob Very 451 21 LA 15000
Sam You 556 21 SF 108000
Use np.where along with isin to check whether each value is present in a column of the other df, then do X if it is present and Y if it is not.
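To see the two pieces in isolation, here is a small sketch using the sample values from the question. The variable names ids, salary, bonus_ids, mask and adjusted are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question
ids = pd.Series([542, 451, 556], name="IDnum")
salary = pd.Series([52000, 15000, 72000], name="Salary")
bonus_ids = pd.Series([556, 134], name="Bonus")

# isin() returns one boolean per row: True where the ID appears in bonus_ids
mask = ids.isin(bonus_ids)

# np.where picks salary * 1.5 where mask is True and the plain salary elsewhere
adjusted = np.where(mask, salary * 1.5, salary)

print(mask.tolist())      # [False, False, True]
print(adjusted.tolist())  # [52000.0, 15000.0, 108000.0]
```

The result is float because one branch multiplies by 1.5, which is why the final Bonus column shows values like 52000.0.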
Method 1: Add as a column to existing df & then move it to a new df
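import numpy as np
import pandas as pd
# 1.5x the salary where IDnum is listed in df2['Bonus'], otherwise the plain salary
df1['Adj_Salary'] = np.where(df1['IDnum'].isin(df2['Bonus']), df1['Salary'] * 1.5, df1['Salary'])
# pop() removes the helper column from df1 again; join() attaches its values to df3 as column 0
df3 = df1.join(pd.DataFrame(df1.pop('Adj_Salary').values.tolist(), index=df1.index))
df3.drop('Salary', axis=1, inplace=True)
df3.rename(columns={0: 'Bonus'}, inplace=True)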
Method 2: Add as a column to new df while creating new df using concat()
a=pd.Series(np.where(df1['IDnum'].isin(df2['Bonus']),df1['Salary']*1.5,df1['Salary']))
df3=pd.concat((df1.loc[:, df1.columns != 'Salary'],a.rename('Bonus')),axis=1, join='inner')
Name IDnum Age City Bonus
Mike Thanks 542 52 NYC 52000.0
Bob Very 451 21 LA 15000.0
Sam You 556 21 SF 108000.0
Simply take the IDs you need from the bonus_df (df2) and filter the salary_df (df1) with them.
Then you just cruise your way to updating the salary by multiplying the base pay by the bonus factor (which, btw, is super generous, can I apply where you work? JK).
To do that, use isin():
df3 = df1[df1['IDnum'].isin(df2['Bonus'])].copy() # keep only the employees in df1 whose IDs appear in df2['Bonus']; .copy() makes df3 independent of df1, so adding columns does not trigger a SettingWithCopyWarning
df3.reset_index(inplace=True, drop=True) # renumber the remaining rows from 0
df3['Bonus'] = df3['Salary'] * (3/2) # Create the bonus field
del df3['Salary'] # Delete the salary field if you don't want it in your final df
And voila, df3 now holds every employee whose ID is in the Bonus column, with the 3/2x pay in the Bonus field.
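The filter above keeps only the rows whose IDnum appears in df2['Bonus']. If the employees without a bonus should stay in the result as well, one possible sketch is to reuse the same isin() mask with .loc instead of filtering. The sample frames are rebuilt from the question here, and casting to float before multiplying is my own choice to keep the column dtype consistent.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "Name": ["Mike Thanks", "Bob Very", "Sam You"],
    "Salary": [52000, 15000, 72000],
    "IDnum": [542, 451, 556],
    "Age": [52, 21, 21],
    "City": ["NYC", "LA", "SF"],
})
df2 = pd.DataFrame({"Bonus": [556, 134]})  # only the Bonus column matters here

# True for the rows of df1 whose IDnum is listed in df2['Bonus']
mask = df1["IDnum"].isin(df2["Bonus"])

# drop() returns a new frame, so df1 itself is left unchanged
df3 = df1.drop(columns="Salary")
df3["Bonus"] = df1["Salary"].astype(float)  # start from the plain salary
df3.loc[mask, "Bonus"] *= 1.5               # apply 3/2x only to the masked rows

print(df3)
```

This keeps all three sample rows and leaves df1 untouched, which matches the wish in the question not to add anything to the first DF.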
Hope this helps :))