How to create a new DF where the last column is multiplied by 3/2 for the IDs listed in another DF?

I have two DFs. The first one is:

DF1

 Name           Salary     IDnum     Age   City 
 Mike Thanks    52000      542       52    NYC
 Bob  Very      15000      451       21    LA
 Sam  You       72000      556       21    SF

The other DF holds only IDnum values. Its columns are split by city, plus a Bonus column listing who gets 3/2x. In this case only Sam You gets the 3/2x bonus, plus the employee with IDnum 134, who is somewhere below my top 3 rows.

DF2

   NYC   LA    SF   Bonus
0  542   451  421   556          
1  745   345  367   134
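
For anyone who wants to reproduce this, the two frames above can be built roughly like this. It is only a sketch: the integer dtypes are an assumption, since just the printed tables are given.

```python
import pandas as pd

# DF1: salary table (values copied from the printed table above)
df1 = pd.DataFrame({
    "Name": ["Mike Thanks", "Bob Very", "Sam You"],
    "Salary": [52000, 15000, 72000],
    "IDnum": [542, 451, 556],
    "Age": [52, 21, 21],
    "City": ["NYC", "LA", "SF"],
})

# DF2: ID numbers grouped by city, plus the Bonus column
df2 = pd.DataFrame({
    "NYC": [542, 745],
    "LA": [451, 345],
    "SF": [421, 367],
    "Bonus": [556, 134],
})
```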

My goal is to build a new DF3 from the salaries in DF1 and the ID numbers in DF2.

It should look something like this. I very much want to avoid adding the column to the first DF, as that would create issues such as repeats and date conflicts.

 Name          IDnum    Age   City  Bonus
 Mike Thanks    542      52    NYC  52000
 Bob  Very      451      21    LA   15000
 Sam  You       556      21    SF   108000

Use np.where together with isin. isin checks whether each IDnum is present in the Bonus column of the other df, and np.where then picks Salary * 1.5 where it is present and the unchanged Salary where it is not.

Method 1: Add the result as a column to the existing df, then move it to a new df

import numpy as np
import pandas as pd

# 1.5x salary for IDs listed in df2['Bonus'], unchanged salary otherwise
df1['Adj_Salary'] = np.where(df1['IDnum'].isin(df2['Bonus']), df1['Salary'] * 1.5, df1['Salary'])
# pop() removes the temporary column from df1 again, so df1 ends up as it started
df3 = df1.join(df1.pop('Adj_Salary').rename('Bonus'))
df3 = df3.drop(columns='Salary')

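Method 2: Add the column while creating the new df with concat()

# Same np.where expression, kept as a Series that shares df1's index so concat aligns the rows
a = pd.Series(np.where(df1['IDnum'].isin(df2['Bonus']), df1['Salary'] * 1.5, df1['Salary']), index=df1.index, name='Bonus')
# Every column except Salary, with the new Bonus column appended
df3 = pd.concat((df1.loc[:, df1.columns != 'Salary'], a), axis=1, join='inner')

Output:

Name            IDnum   Age     City    Bonus
Mike Thanks     542     52      NYC     52000.0
Bob Very        451     21      LA      15000.0
Sam You         556     21      SF      108000.0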
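
As a self-contained check, the same idea can be sketched without touching df1 at all. The sample frames are rebuilt from the question's tables, and with_bonus is a hypothetical helper name used only for illustration, not something from pandas.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's tables
df1 = pd.DataFrame({
    "Name": ["Mike Thanks", "Bob Very", "Sam You"],
    "Salary": [52000, 15000, 72000],
    "IDnum": [542, 451, 556],
    "Age": [52, 21, 21],
    "City": ["NYC", "LA", "SF"],
})
df2 = pd.DataFrame({
    "NYC": [542, 745],
    "LA": [451, 345],
    "SF": [421, 367],
    "Bonus": [556, 134],
})


def with_bonus(salaries, bonus_ids, factor=1.5):
    """Hypothetical helper: return a new frame with Salary replaced by Bonus."""
    # True for rows whose IDnum appears among the bonus IDs
    gets_bonus = salaries["IDnum"].isin(bonus_ids)
    # Multiply the salary where the flag is set, keep it as-is otherwise
    adjusted = np.where(gets_bonus, salaries["Salary"] * factor, salaries["Salary"])
    # drop() and assign() both return new frames, so the input is not modified
    return salaries.drop(columns="Salary").assign(Bonus=adjusted)


df3 = with_bonus(df1, df2["Bonus"])
print(df3)
```

Because drop() and assign() return new objects, df1 still has its Salary column afterwards, which matches the requirement of not adding anything to the first DF.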

Simply take the IDs you need from the bonus DataFrame (df2) and filter the salary DataFrame (df1) with them.

Then update the salary by multiplying the base salary by the bonus factor (which, by the way, is super generous; can I apply where you work? JK).

To do that, use isin():

# Keep only the employees in df1 whose IDs appear in df2['Bonus'].
# .copy() gives an independent frame, so adding a column below does not raise SettingWithCopyWarning.
df3 = df1[df1['IDnum'].isin(df2['Bonus'])].copy()
# Optional: renumber the remaining rows from 0
df3 = df3.reset_index(drop=True)
# Create the Bonus field
df3['Bonus'] = df3['Salary'] * (3 / 2)
# Delete the Salary field if you don't want it in your final df
del df3['Salary']

And voila, df3 now holds the employees who receive the bonus, with their adjusted pay in the Bonus column.
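
On the sample data from the question, the filtering approach can be checked with a short sketch. The frames are rebuilt from the question's tables, and the explicit .copy() is my own choice here for making the filtered frame independent of df1.

```python
import pandas as pd

# Sample data copied from the question's tables
df1 = pd.DataFrame({
    "Name": ["Mike Thanks", "Bob Very", "Sam You"],
    "Salary": [52000, 15000, 72000],
    "IDnum": [542, 451, 556],
    "Age": [52, 21, 21],
    "City": ["NYC", "LA", "SF"],
})
df2 = pd.DataFrame({
    "NYC": [542, 745],
    "LA": [451, 345],
    "SF": [421, 367],
    "Bonus": [556, 134],
})

# Rows of df1 whose IDnum is listed in df2['Bonus'], as an independent copy
bonus_only = df1[df1["IDnum"].isin(df2["Bonus"])].copy()
bonus_only = bonus_only.reset_index(drop=True)
bonus_only["Bonus"] = bonus_only["Salary"] * (3 / 2)
bonus_only = bonus_only.drop(columns="Salary")
print(bonus_only)
```

With these three rows the result contains only Sam You, since IDnum 134 does not appear in the part of df1 shown in the question. If every employee should be listed, the np.where approach keeps the unmatched rows as well.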

Hope this helps :))

