
Python2.7: Filter out a group from a dataframe based on a condition in groupby

I have a dataframe and I would like to filter it further so that it only includes groups whose rows do not have a certain value in a column.

For example, in the dataframe below, Hamilton (ham) has an overtake on lap 3 of his stint 1, so I want to remove ALL of Hamilton's stint 1 lap-time records.

I thought of doing a groupby followed by get_group, iterating through each row in the group to detect a non-null value in the "clear lap?" column, labelling every row of that group "yes" in a new column, and then filtering the group out.

Is there a faster way of subsetting the dataframe?

Dataframe:

                    name driverRef  stint        tyre  lap  pos clear lap?
0  Australian Grand Prix    vettel    1.0  Super soft    2    1        NaN
1  Australian Grand Prix    vettel    1.0  Super soft    3    1        NaN
2  Australian Grand Prix    vettel    1.0  Super soft    4    1        NaN
3  Australian Grand Prix       ham    1.0  Super soft    2    3        NaN
4  Australian Grand Prix       ham    1.0  Super soft    3    2   overtook
5  Australian Grand Prix       ham    1.0  Super soft    4    2        NaN

I believe you first need to get all groups that contain a non-null value by filtering, and then filter again with isin:

Notice: Thank you, @Vivek Kalyanarangan, for the improvement with unique.

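# Drivers that have at least one flagged (non-null) lap
a = df.loc[df['clear lap?'].notnull(), 'driverRef'].unique()
print (a)
['ham']

# Keep only the rows whose driver is not in that list
df = df[~df['driverRef'].isin(a)]
print (df)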
                    name driverRef  stint        tyre  lap  pos clear lap?
0  Australian Grand Prix    vettel    1.0  Super soft    2    1        NaN
1  Australian Grand Prix    vettel    1.0  Super soft    3    1        NaN
2  Australian Grand Prix    vettel    1.0  Super soft    4    1        NaN

Another solution, slower:

df = df[df['clear lap?'].isnull().groupby(df['driverRef']).transform('all')]

Or the slowest:

df = df.groupby('driverRef').filter(lambda x: x['clear lap?'].isnull().all())
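
The snippets above group by driverRef alone, so a driver with one flagged lap loses every stint. The question asks to drop only the affected stint, so the following is a minimal sketch of the transform('all') variant grouped by race, driver and stint together. The key list (name, driverRef, stint) is an assumption about what identifies a stint, and the two stint 2.0 rows for ham are hypothetical, added only to show that a clean stint survives.

```python
import numpy as np
import pandas as pd

# Sample data modelled on the question. The stint 2.0 rows for 'ham' are
# hypothetical and exist only to demonstrate that a clean stint is kept.
df = pd.DataFrame({
    'name': ['Australian Grand Prix'] * 8,
    'driverRef': ['vettel'] * 3 + ['ham'] * 5,
    'stint': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0],
    'tyre': ['Super soft'] * 6 + ['Soft'] * 2,
    'lap': [2, 3, 4, 2, 3, 4, 5, 6],
    'pos': [1, 1, 1, 3, 2, 2, 2, 2],
    'clear lap?': [np.nan, np.nan, np.nan, np.nan, 'overtook',
                   np.nan, np.nan, np.nan],
})

# Assumed grouping keys: one group per race, driver and stint.
keys = ['name', 'driverRef', 'stint']

# True for every row whose group contains no flagged (non-null) lap.
clean = (df['clear lap?'].isnull()
         .groupby([df[k] for k in keys])
         .transform('all'))

# Boolean indexing keeps only rows that belong to clean groups.
result = df[clean]
print(result)
```

With this sample, ham's stint 1.0 rows are removed while his stint 2.0 rows and all of vettel's rows remain. The same key list can be passed to groupby(keys).filter(...) if readability matters more than speed.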
