How to return a value in a column based on another's dataframe's values

Question

I am trying to calculate a column in a dataframe based on another data frame. It is used to calculate seniority bonus of a HR payroll.

the two dataframes are:

df1 = headcount

peopleID    peopleSeniority
1               2
2               6
3               12
4               30

df2 = seniority_bonus

seniority    seniorityBonus
5            500
10           1000
15           2000 
20           3000

and I would like to write a script that return df2 Seniority bonus according to df1 people's seniority. Something that will do: if df1['people seniority'] > df2['seniority'] >, df2['senioritybonus'] like if df1 seniority in years > 5, 500 if df1 seniority in years > 10, 2000...

I've tried to use query but it is not working, I do not know how to write a loop that would be able to make the calculation.

Would anyone have an idea?

Answer 1

It is easier if the bonus always increases every five years. Add a column to df1 then merge with df2 on that column

df1['seniority'] = (df1.peopleSeniority // 5) *5
df1 = df1.merge(df2, on='seniority')

Read more about merge in the docs and this Q&A

Edit

If the increments in years between brackets are not even, nor linear, nor any other algebraic function, then you can use pd.cut() to categorize each person's seniority.

df1['seniority_bonus'] = pd.cut(df.peopleSeniority,
    bins=[0, 5, 8, 11, 15, 21, 30], right=False, labels=[0, 500, 1000, 1500, 2000, 2500])

Notice how labels have one less item than bins. That is because with a 7 integer list you can only build 6 categories (there will not be a category for 30-infinite).

How to return a value in a column based on another's dataframe's values

Question

1 answers

solution1
0 2020-08-03 10:59:22

How to return a value in a column based on another's dataframe's values

Question

1 answers

solution1 0 2020-08-03 10:59:22

solution1
0 2020-08-03 10:59:22