I'm trying to drop an entire group of rows when a certain condition is met.
import pandas as pd
raw_data = {'regiment': ['51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st'],
'trucks': ['MAZ-7310', 'MAZ-7310', 'MAZ-7310', 'MAZ-7310', 'Tatra 810', 'Tatra 810', 'Tatra 810', 'Tatra 810', 'ZIS-150', 'ZIS-150', 'ZIS-150', 'ZIS-150'],
'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA', 'TATA', 'MAN', 'REN', 'TATA'],
'counts': [0,0,1,1,0,0,1,0, 1,2,3,4]}
df = pd.DataFrame(raw_data, columns = ['regiment', 'trucks','drivers','counts'])
    regiment     trucks  drivers  counts
0       51st   MAZ-7310      MAZ       0
1       51st   MAZ-7310      MAZ       0
2       51st   MAZ-7310      IVE       1
3       51st   MAZ-7310      IVE       1
4       51st  Tatra 810      MAN       0
5       51st  Tatra 810      MAN       0
6       51st  Tatra 810     MERC       1
7       51st  Tatra 810     TATA       0
8       51st    ZIS-150     TATA       1
9       51st    ZIS-150      MAN       2
10      51st    ZIS-150      REN       3
11      51st    ZIS-150     TATA       4
I'm trying to drop the whole MAZ-7310 group, because it contains rows where drivers is MAZ and counts == 0.
So I followed the post "Pandas groupby and filter":
df = df.groupby(['regiment','trucks']).filter(lambda x: ~((x['counts'] == 0) & (x['drivers'] == 'MAZ')).all())
but it seems that it does not give me the output that I need.
The expected output
    regiment     trucks  drivers  counts
4       51st  Tatra 810      MAN       0
5       51st  Tatra 810      MAN       0
6       51st  Tatra 810     MERC       1
7       51st  Tatra 810     TATA       0
8       51st    ZIS-150     TATA       1
9       51st    ZIS-150      MAN       2
10      51st    ZIS-150      REN       3
11      51st    ZIS-150     TATA       4
How can I get this output?
thx
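First we assign a new column called m, which is a boolean that is True for the rows where drivers is MAZ and counts is 0.
Then we use GroupBy with transform('any') to flag every row of each group where any m is True.
Then we use boolean indexing with ~ to get the opposite, that is, the rows of the groups that were not flagged.
Methods used: DataFrame.assign, GroupBy.transform, boolean indexing.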
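# Compare counts to 0 explicitly: on an integer column, ~ is a bitwise NOT,
# so ~df['counts'] would also flag even counts such as 2.
mask = (df.assign(m=(df['drivers'].eq('MAZ') & df['counts'].eq(0)))
.groupby(['regiment','trucks'])['m'].transform('any')
)
# Keep only the rows whose group contains no flagged row.
df[~mask]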
    regiment     trucks  drivers  counts
4       51st  Tatra 810      MAN       0
5       51st  Tatra 810      MAN       0
6       51st  Tatra 810     MERC       1
7       51st  Tatra 810     TATA       0
8       51st    ZIS-150     TATA       1
9       51st    ZIS-150      MAN       2
10      51st    ZIS-150      REN       3
11      51st    ZIS-150     TATA       4
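To make the three steps easier to follow, here is a minimal, self-contained sketch that keeps each intermediate result in its own variable. The names row_flag, group_flag and result are only illustrative, and the sketch groups the boolean Series directly instead of going through assign, which should be equivalent for this data.

```python
import pandas as pd

# Same data as in the question, written compactly.
df = pd.DataFrame({
    'regiment': ['51st'] * 12,
    'trucks': ['MAZ-7310'] * 4 + ['Tatra 810'] * 4 + ['ZIS-150'] * 4,
    'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA',
                'TATA', 'MAN', 'REN', 'TATA'],
    'counts': [0, 0, 1, 1, 0, 0, 1, 0, 1, 2, 3, 4],
})

# Step 1: row-level flag, True only where both conditions hold.
row_flag = df['drivers'].eq('MAZ') & df['counts'].eq(0)

# Step 2: broadcast "does this group contain a flagged row?" to every row.
group_flag = row_flag.groupby([df['regiment'], df['trucks']]).transform('any')

# Step 3: keep only the rows whose group has no flagged row.
result = df[~group_flag]
print(result)
```

Only rows 0 and 1 are flagged in step 1, but step 2 spreads that flag to all four MAZ-7310 rows, which is why the whole group disappears in step 3.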
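To get your desired output, you need to use any instead of all: the group should be dropped as soon as any of its rows matches the condition, not only when all of its rows do. Therefore, just change all to any in your code: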
df_final = df.groupby(['regiment','trucks']).filter(lambda x: ~((x['counts'] ==0)
& (x['drivers'] == 'MAZ')).any())
Out[234]:
    regiment     trucks  drivers  counts
4       51st  Tatra 810      MAN       0
5       51st  Tatra 810      MAN       0
6       51st  Tatra 810     MERC       1
7       51st  Tatra 810     TATA       0
8       51st    ZIS-150     TATA       1
9       51st    ZIS-150      MAN       2
10      51st    ZIS-150      REN       3
11      51st    ZIS-150     TATA       4
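If it helps to see why all did not work, the following sketch evaluates the condition per trucks group with both reductions. It is only an illustration on the question's data: the variable names are made up, it groups on trucks alone (regiment is constant here), and it uses not instead of ~ so the lambda returns a plain Python bool.

```python
import pandas as pd

# Same data as in the question, written compactly.
df = pd.DataFrame({
    'regiment': ['51st'] * 12,
    'trucks': ['MAZ-7310'] * 4 + ['Tatra 810'] * 4 + ['ZIS-150'] * 4,
    'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA',
                'TATA', 'MAN', 'REN', 'TATA'],
    'counts': [0, 0, 1, 1, 0, 0, 1, 0, 1, 2, 3, 4],
})

# Row-level condition: driver is MAZ and count is 0.
cond = df['drivers'].eq('MAZ') & df['counts'].eq(0)

# Reduce the condition per group in both ways.
all_per_group = cond.groupby(df['trucks']).all()
any_per_group = cond.groupby(df['trucks']).any()
print(all_per_group)
print(any_per_group)

def matches(group):
    # Boolean Series for one group: True where the row meets the condition.
    return group['drivers'].eq('MAZ') & group['counts'].eq(0)

# "not all" keeps a group unless every row matches; "not any" drops it
# as soon as one row matches.
kept_with_all = df.groupby('trucks').filter(lambda g: not matches(g).all())
kept_with_any = df.groupby('trucks').filter(lambda g: not matches(g).any())
print(len(kept_with_all), len(kept_with_any))
```

In MAZ-7310 only two of the four rows match, so all is False for that group and the original filter keeps it, while any is True and removes it.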