How to return a value in a column based on another dataframe's values

I am trying to calculate a column in a dataframe based on another dataframe. It is used to calculate the seniority bonus in an HR payroll.

The two dataframes are:

df1 = headcount

peopleID    peopleSeniority
1               2
2               6
3               12
4               30

df2 = seniority_bonus

seniority    seniorityBonus
5            500
10           1000
15           2000 
20           3000

I would like to write a script that returns the seniority bonus from df2 according to each person's seniority in df1. Something like: if df1['peopleSeniority'] > df2['seniority'], return df2['seniorityBonus']. For example, if the seniority in years is > 5, the bonus is 500; if it is > 10, the bonus is 1000; and so on.

I've tried to use query but it is not working, and I do not know how to write a loop that would do the calculation.

Would anyone have an idea?

This is easier if the bonus always increases every five years: add a column to df1 that rounds each seniority down to a multiple of 5, then merge with df2 on that column.

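# Round each seniority down to the nearest multiple of 5 to get the bracket key
df1['seniority'] = (df1.peopleSeniority // 5) * 5
# how='left' keeps people whose bracket has no row in df2 (their bonus is NaN)
df1 = df1.merge(df2, on='seniority', how='left')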

Read more about merge in the docs and this Q&A
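As a minimal, self-contained sketch of the merge approach, the snippet below rebuilds the two dataframes from the question and applies the floor-division step. The left merge and the choice to treat a missing bracket as a bonus of 0 are assumptions added here, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"peopleID": [1, 2, 3, 4],
                    "peopleSeniority": [2, 6, 12, 30]})
df2 = pd.DataFrame({"seniority": [5, 10, 15, 20],
                    "seniorityBonus": [500, 1000, 2000, 3000]})

# Round each seniority down to its 5-year bracket: 2 -> 0, 6 -> 5, 12 -> 10, 30 -> 30
df1["seniority"] = (df1["peopleSeniority"] // 5) * 5

# Left merge keeps every person; brackets missing from df2 (0 and 30 here) give NaN
result = df1.merge(df2, on="seniority", how="left")

# Assumption: no matching bracket means no bonus
result["seniorityBonus"] = result["seniorityBonus"].fillna(0).astype(int)
print(result)
```

Note that the person with 30 years of seniority gets no match, because df2 stops at 20. If the top bracket is meant to be open-ended, one option is to cap the key first, for example with df1["seniority"].clip(upper=20).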

Edit

If the bracket widths in years are not even and do not follow any other simple formula, you can use pd.cut() to assign each person's seniority to a bracket.

df1['seniority_bonus'] = pd.cut(df1.peopleSeniority,
    bins=[0, 5, 8, 11, 15, 21, 30], right=False, labels=[0, 500, 1000, 1500, 2000, 2500])

Notice that labels has one fewer item than bins. That is because a list of 7 bin edges defines only 6 categories. There is no category for 30 and above, so those values become NaN.
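A runnable sketch of the pd.cut() approach on the question's data might look like the following. The bracket edges and bonuses are the illustrative ones from the answer; the extra float("inf") edge and its 3000 label are hypothetical additions so that 30 or more years still lands in a bracket.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"peopleID": [1, 2, 3, 4],
                    "peopleSeniority": [2, 6, 12, 30]})

# Edges from the answer, plus an assumed open-ended last bracket
bins = [0, 5, 8, 11, 15, 21, 30, float("inf")]
# One label per bracket; the final 3000 is a hypothetical value
labels = [0, 500, 1000, 1500, 2000, 2500, 3000]

# right=False makes each bracket closed on the left: [0, 5), [5, 8), ...
df1["seniority_bonus"] = pd.cut(df1["peopleSeniority"], bins=bins,
                                right=False, labels=labels)

# pd.cut returns a categorical column; convert to int so it can be summed
df1["seniority_bonus"] = df1["seniority_bonus"].astype(int)
print(df1)
```

The conversion to int only works when every value falls inside some bracket. Without the open-ended last edge, the 30-year row would be NaN and the cast would fail.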
