I have the following dataframe, which lists each employee's number, the department they are in, and their role code.
Department Name Employee Number Role Code
0 Dept1 1000 1
1 Dept1 1000 2
2 Dept2 1000 2
4 Dept3 1000 2
5 Dept4 1000 1
0 Dept1 1001 1
1 Dept2 1001 1
2 Dept2 1001 2
4 Dept3 1001 1
5 Dept3 1001 2
Each employee should have only code 1 or code 2 in any given department. I need to filter this dataframe down to the cases where an employee has both codes in the same department, returning both rows, so the output would be:
Department Name Employee Number Role Code
0 Dept1 1000 1
1 Dept1 1000 2
1 Dept2 1001 1
2 Dept2 1001 2
4 Dept3 1001 1
5 Dept3 1001 2
What would be the best way to do that?
Try this:
df.groupby(['Department Name','Employee Number']).filter(lambda x: x['Role Code'].nunique() == 2)
Department Name Employee Number Role Code
0 Dept1 1000 1
1 Dept1 1000 2
6 Dept2 1001 1
7 Dept2 1001 2
8 Dept3 1001 1
9 Dept3 1001 2
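For a self-contained check, the sketch below rebuilds the sample data and applies the same filter. The way the dataframe is constructed, with a default 0 to 9 index, is an assumption made for illustration; the question itself shows a repeated index.

```python
import pandas as pd

# Sample data from the question; a default RangeIndex is assumed here.
df = pd.DataFrame({
    "Department Name": ["Dept1", "Dept1", "Dept2", "Dept3", "Dept4",
                        "Dept1", "Dept2", "Dept2", "Dept3", "Dept3"],
    "Employee Number": [1000] * 5 + [1001] * 5,
    "Role Code": [1, 2, 2, 2, 1, 1, 1, 2, 1, 2],
})

# Keep only the (department, employee) groups that contain two distinct role codes.
result = df.groupby(["Department Name", "Employee Number"]).filter(
    lambda g: g["Role Code"].nunique() == 2
)
print(result)
```

filter keeps or drops whole groups and leaves the surviving rows in their original order with their original index labels.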
Let's try groupby().transform('nunique'):
mask = df.groupby(['Department Name','Employee Number'])['Role Code'].transform('nunique')
df[mask==2]
Output:
Department Name Employee Number Role Code
0 Dept1 1000 1
1 Dept1 1000 2
1 Dept2 1001 1
2 Dept2 1001 2
4 Dept3 1001 1
5 Dept3 1001 2
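As a runnable sketch of the mask approach, the example below rebuilds the sample data with the repeated index labels shown in the question (the construction itself is an assumption) and applies the same transform. Since transform returns one value per original row, the mask lines up with the dataframe row by row.

```python
import pandas as pd

# Sample data from the question, keeping its repeated index labels.
df = pd.DataFrame(
    {
        "Department Name": ["Dept1", "Dept1", "Dept2", "Dept3", "Dept4",
                            "Dept1", "Dept2", "Dept2", "Dept3", "Dept3"],
        "Employee Number": [1000] * 5 + [1001] * 5,
        "Role Code": [1, 2, 2, 2, 1, 1, 1, 2, 1, 2],
    },
    index=[0, 1, 2, 4, 5, 0, 1, 2, 4, 5],
)

# Number of distinct role codes per (department, employee), broadcast to every row.
mask = df.groupby(["Department Name", "Employee Number"])["Role Code"].transform("nunique")

# Keep the rows whose group holds both role codes.
result = df[mask == 2]
print(result)
```

On larger frames this form may be faster than filter with a lambda, because the per-group count is computed by a built-in aggregation rather than a Python function called once per group.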