
How to create a column that identifies the number of rows until the next occurrence of a value in another column with pandas?

I'm trying to work out how to create a pandas column that identifies the number of rows until the next occurrence of a value in another column. In essence, it should behave like this:

rowid  event   countdown
1      False   NaT
2      True    0 # resets countdown
3      False   1
4      False   2
5      True    0 # resets countdown
6      False   1

Here the event column indicates whether an event occurs (True) or not (False), and the countdown column gives the number of subsequent rows/steps that have to occur until said event occurs. I have tried the following:

y['block'] = (y['event'] != y['event'].shift(1)).astype(int).cumsum()
y['countdown'] = y.groupby('block').transform(lambda x: range(1, len(x) + 1))

but it seems grossly inefficient for the operation, and it doesn't actually do what I described: it identifies periods as groups instead of performing a simple rollout.

Does anyone know how I can accomplish this succinctly? Thanks!

I would use cumcount, grouping on the cumulative sum of event:

df.groupby(df.event.cumsum()).cumcount()
Out[46]: 
0    0
1    0
2    1
3    2
4    0
5    1
dtype: int64
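
For reference, here is a self-contained sketch of the same idea applied to the sample data from the question. The variable name groups and the masking step are my own additions, not part of the answer above: the plain cumcount result gives 0 for rows before the first event, so this version assumes you want those rows marked as missing (NaN here, where the question shows NaT).

```python
import pandas as pd

# Sample data from the question (rowid 1..6).
df = pd.DataFrame({"event": [False, True, False, False, True, False]})

# cumsum() increments on every True, so each event starts a new group.
groups = df["event"].cumsum()

# Position within each group = number of rows since the most recent event.
df["countdown"] = df.groupby(groups).cumcount()

# Assumption: rows before the first event (group 0) should be missing.
df["countdown"] = df["countdown"].where(groups > 0)

print(df)
```

Note that the mask introduces NaN, so the countdown column ends up as float rather than int. If you are happy with 0 in the leading rows, skip the where step and keep the integer result.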
