I am trying to calculate a column in a dataframe based on another data frame. It is used to calculate seniority bonus of a HR payroll.
the two dataframes are:
df1 = headcount
peopleID peopleSeniority
1 2
2 6
3 12
4 30
df2 = seniority_bonus
seniority seniorityBonus
5 500
10 1000
15 2000
20 3000
and I would like to write a script that return df2 Seniority bonus according to df1 people's seniority. Something that will do: if df1['people seniority'] > df2['seniority'] >, df2['senioritybonus'] like if df1 seniority in years > 5, 500 if df1 seniority in years > 10, 2000...
I've tried to use query but it is not working, I do not know how to write a loop that would be able to make the calculation.
Would anyone have an idea?
It is easier if the bonus always increases every five years. Add a column to df1 then merge with df2 on that column
df1['seniority'] = (df1.peopleSeniority // 5) *5
df1 = df1.merge(df2, on='seniority')
Read more about merge in the docs and this Q&A
Edit
If the increments in years between brackets are not even, nor linear, nor any other algebraic function, then you can use pd.cut()
to categorize each person's seniority.
df1['seniority_bonus'] = pd.cut(df.peopleSeniority,
bins=[0, 5, 8, 11, 15, 21, 30], right=False, labels=[0, 500, 1000, 1500, 2000, 2500])
Notice how labels have one less item than bins. That is because with a 7 integer list you can only build 6 categories (there will not be a category for 30-infinite).
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.