Pandas DataFrame: select rows that match a condition plus the N rows preceding each match

Question

I have a DataFrame and I want to select every row that satisfies a condition, together with the N rows immediately before each such row.

Example:

import pandas as pd
df = pd.DataFrame({'value':[10,20,30,40,50,60,70,80,90],'is_fishing':['NO','NO','YES','NO','YES','NO','NO','NO','YES']})

     value     is_fishing
0     10         NO
1     20         NO
2     30        YES
3     40         NO
4     50        YES
5     60         NO
6     70         NO
7     80         NO
8     90        YES

Expected output with N=1 and the condition is_fishing == 'YES':

     value     is_fishing
1     20         NO
2     30        YES
3     40         NO
4     50        YES
7     80         NO
8     90        YES

Answer 1

NumPy's `split`

import numpy as np

def n_prior_to_condition(df, n, condition):
    # Split after each matching row, then keep the last n+1 rows of each chunk.
    i = np.flatnonzero(condition) + 1
    return pd.concat([d.tail(n+1) for d in np.split(df, i)])

n_prior_to_condition(df, 1, df.is_fishing=="YES")

   value is_fishing
1     20         NO
2     30        YES
3     40         NO
4     50        YES
7     80         NO
8     90        YES
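
`np.split` is documented for arrays, and whether it accepts a DataFrame directly may depend on the installed pandas and NumPy versions (an assumption worth checking locally). Here is a minimal sketch of the same idea that splits row positions instead and selects with `iloc`. The name `n_prior_by_position` is hypothetical, and the sketch assumes `condition` is a boolean Series or array aligned row for row with `df`:

```python
import numpy as np
import pandas as pd

def n_prior_by_position(df, n, condition):
    # Hypothetical helper: same splitting idea, applied to integer positions.
    cond = np.asarray(condition, dtype=bool)
    # Cut points sit just after each matching row.
    cut = np.flatnonzero(cond) + 1
    chunks = np.split(np.arange(len(df)), cut)
    # Keep the last n+1 positions of every chunk that ends with a match;
    # an empty or non-matching final chunk is skipped.
    keep = [c[-(n + 1):] for c in chunks if len(c) and cond[c[-1]]]
    if not keep:
        return df.iloc[0:0]
    return df.iloc[np.concatenate(keep)]

df = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60, 70, 80, 90],
                   'is_fishing': ['NO', 'NO', 'YES', 'NO', 'YES', 'NO', 'NO', 'NO', 'YES']})
result = n_prior_by_position(df, 1, df.is_fishing == "YES")
```

Because only positions are split, this should also work when the index is not a default `RangeIndex`.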

`groupby`

def n_prior_to_condition(df, n, condition):
    # Count matches from the bottom up: each match shares a label with the rows leading up to it.
    groups = condition.iloc[::-1].cumsum()
    return df.groupby(groups).tail(n+1)

n_prior_to_condition(df, 1, df.is_fishing=="YES")

   value is_fishing
1     20         NO
2     30        YES
3     40         NO
4     50        YES
7     80         NO
8     90        YES
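
One edge case to watch: the sample data ends with a matching row. If rows follow the last match, the reversed cumulative sum labels them 0, and `tail(n+1)` would still return rows from that group. Here is a sketch of a variant that drops that group first. The name `n_prior_drop_trailing` is hypothetical, and it assumes `condition` is a boolean Series sharing `df`'s index:

```python
import pandas as pd

def n_prior_drop_trailing(df, n, condition):
    # Count matches from the bottom up, then restore the original row order
    # so the labels line up with df row by row.
    groups = condition.iloc[::-1].cumsum().iloc[::-1]
    # Label 0 means "no match at or below this row": exclude those rows.
    matched = groups > 0
    return df[matched].groupby(groups[matched]).tail(n + 1)

# Example where the data does not end with a match.
df2 = pd.DataFrame({'value': [10, 20, 30, 40],
                    'is_fishing': ['NO', 'YES', 'NO', 'NO']})
result2 = n_prior_drop_trailing(df2, 1, df2.is_fishing == "YES")
```

Here `result2` should contain only rows 0 and 1; the two trailing `NO` rows are left out.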

solution1 0 ACCEPTED 2022-04-04 23:09:48