简体   繁体   中英

Python2.7: Subset dataframe based on condition in first row of groupby

I would like to subset a pandas dataframe based on a condition which only the first row in the groupby is subjected to.

Dataframe is to be grouped by "name", "driverRef", "tyre", "stint"

For eg, in the df below, because alonso started his stint 2 in position 12, i want to remove all of alonso's records from the df.

    name                   driverRef stint  tyre      lap   pos     
0   Australian Grand Prix   alonso  1.0     Super soft  1   9        
1   Australian Grand Prix   alonso  1.0     Super soft  2   9        
2   Australian Grand Prix   alonso  1.0     Super soft  3   9       
3   Australian Grand Prix   alonso  2.0     Super soft  20   12        
4   Australian Grand Prix   alonso  2.0     Super soft  21   11     
5   Australian Grand Prix   alonso  2.0     Super soft  22   10       

Expected output:

    name                   driverRef stint  tyre      lap   pos     
0   Australian Grand Prix   alonso  1.0     Super soft  2   9        
1   Australian Grand Prix   alonso  1.0     Super soft  3   9        
2   Australian Grand Prix   alonso  1.0     Super soft  4   9        

I tried this, but it doesn't implemenent the effect correctly:

df.loc[df.groupby(['name', 'driverRef', 'tyre', 'stint']).first().reset_index()['position'].isin(list(range(1,11))).index]

EDIT: My code does work, but please see @jezrael's answer for a more succint/better way of writing.

You are really close, need transform for return Series with same length as original df :

s = df.groupby(['name', 'driverRef', 'tyre', 'stint'])['pos'].transform('first')
print (s)
0     9
1     9
2     9
3    12
4    12
5    12
Name: pos, dtype: int64

df = df[s.isin(list(range(1,11)))]
print (df)
                    name driverRef  stint        tyre  lap  pos
0  Australian Grand Prix    alonso    1.0  Super soft    1    9
1  Australian Grand Prix    alonso    1.0  Super soft    2    9
2  Australian Grand Prix    alonso    1.0  Super soft    3    9

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM