complex dataframe filtering request on the last occurrence of a value in pandas/Python [EDIT]

Question

I am having a hard time with a complex dataframe filtering task.

Here is the problem:

Among rows that share the same 'id', the column 'job' can take the values 'fireman', NaN (missing), or 'policeman'.

I would like to filter my dataframe so that, for each 'id', I keep only the rows from the start of the last consecutive run of 'fireman' in 'job' onward. I assume I first have to group by 'job' and then filter:

 df.groupby("job").filter(lambda x: f(x))

I don't know which function f is appropriate.

Any ideas?

Sample data to try:

import pandas as pd

df = pd.DataFrame([[79,1,None], [79,2,'fireman'],[79,3,'fireman'],[79,4,None],[79,5,None],[79,6,'fireman'],[79,7,'fireman'],[79,8,'policeman']], columns=['id','day','job'])

# Expected result of the filtering
output = pd.DataFrame([[79,6,'fireman'],[79,7,'fireman'],[79,8,'policeman']], columns=['id','day','job'])

Answer 1

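Here is a version that does not need any extra variables (column names and values adapted to the sample data in the question):

# Per id, keep the rows from the start of the last run of consecutive 'fireman' rows onward.
# A run start is a 'fireman' row whose previous row differs and whose next row is also 'fireman'.
# Relies on a sorted, unique index such as the default RangeIndex.
df.groupby('id').apply(lambda grp: grp[grp.index >=
                                       ((grp.job.shift() != grp.job) &
                                        (grp.job.shift(-1) == grp.job) &
                                        (grp.job == 'fireman')
                                       ).cumsum().idxmax()]
                      ).reset_index(level=0, drop=True)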
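
The same idea can also be written without `groupby.apply`, using only grouped `shift`, `cumsum`, and `transform`. What follows is a minimal sketch under a few assumptions, not the answerer's method: a single isolated 'fireman' row counts as a run (the expression above requires at least two consecutive rows), ids that never contain 'fireman' are dropped entirely, and rows are already in day order within each id. The names `is_fire`, `prev_fire`, `run_start`, `run_no`, and `last_run` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[79, 1, None], [79, 2, 'fireman'], [79, 3, 'fireman'], [79, 4, None],
     [79, 5, None], [79, 6, 'fireman'], [79, 7, 'fireman'], [79, 8, 'policeman']],
    columns=['id', 'day', 'job'])

# True where the row is a 'fireman' row (missing values compare as False).
is_fire = df['job'].eq('fireman')

# 1 where the previous row of the same id was 'fireman', else 0.
prev_fire = is_fire.astype(int).groupby(df['id']).shift(fill_value=0)

# A run starts at a 'fireman' row that does not follow another 'fireman' row.
run_start = is_fire & prev_fire.eq(0)

# Number the runs within each id; rows after a run keep that run's number.
run_no = run_start.astype(int).groupby(df['id']).cumsum()

# Highest run number per id, broadcast back to every row.
last_run = run_no.groupby(df['id']).transform('max')

# Keep rows at or after the start of the last run.
# Assumption: ids with no 'fireman' run at all (last_run == 0) are dropped.
result = df[(run_no == last_run) & (last_run > 0)]
print(result)
```

On the sample data this keeps the rows for days 6, 7, and 8. Because every step is a vectorized grouped operation, the original index and the 'id' column are preserved without a `reset_index` call.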

solution1
0 ACCEPTED 2017-10-20 12:09:25
