
How to get previous rows of a pandas GroupedBy Dataframe based on a condition on the current row?

I have a dataframe like this:

StringCol Timestamp GroupID Flag
   xyz    20170101   123     yes
   abc    20170101   123     yes
   def    20170101   123     yes
   ghi    20170101   123     no
   abc    20170101   124     yes
   jkl    20170101   124     yes
   pqr    20170101   124     no
   klm    20170101   124     yes
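For anyone who wants to reproduce this, the frame above could be built roughly as follows. This is only a sketch: storing Timestamp as a plain integer and using the default 0..7 index are assumptions, since the post does not state the dtypes.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: Timestamp is kept as an integer; the real column may be a datetime.
df = pd.DataFrame({
    'StringCol': ['xyz', 'abc', 'def', 'ghi', 'abc', 'jkl', 'pqr', 'klm'],
    'Timestamp': [20170101] * 8,
    'GroupID': [123, 123, 123, 123, 124, 124, 124, 124],
    'Flag': ['yes', 'yes', 'yes', 'no', 'yes', 'yes', 'no', 'yes'],
})
```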

I want to group this by GroupID and, for each group, get the rows where Flag is "no" together with the X rows immediately before each of them (the dataframe is already sorted by GroupID and Timestamp).

So, if X = 2, I want the result to be something like:

StringCol Timestamp GroupID Flag
   abc    20170101   123     yes
   def    20170101   123     yes
   ghi    20170101   123     no
   abc    20170101   124     yes
   jkl    20170101   124     yes
   pqr    20170101   124     no

How do I achieve this? Thanks.

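This returns the last "no" row in each group together with the X rows before it (X is the k argument, 2 by default). It assumes every group contains at least one "no".

def prevK(x, k=2):
    # Positional index of the last 'no' row within this group
    i = x.reset_index(drop=True).Flag.eq('no').iloc[::-1].idxmax()
    # Clamp at 0 so a 'no' near the top of a group cannot give a negative slice start
    return x.iloc[max(i - k, 0):i + 1, :]

df.groupby('GroupID', group_keys=False).apply(prevK)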

  StringCol  Timestamp  GroupID Flag
1       abc   20170101      123  yes
2       def   20170101      123  yes
3       ghi   20170101      123   no
4       abc   20170101      124  yes
5       jkl   20170101      124  yes
6       pqr   20170101      124   no
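If a group can contain more than one "no", or none at all, a mask-based variant may be easier to reason about. The following is a minimal sketch, not the answer's method: the helper name rows_before_no and its x parameter are made up for illustration, and it assumes the rows of each group are already in the desired order. It marks a row as kept when it is a "no" itself or when one of the next x rows in the same group is a "no".

```python
import pandas as pd

def rows_before_no(df, x=2):
    """Keep every 'no' row plus up to x rows before it, within each GroupID."""
    is_no = df['Flag'].eq('no')
    keep = is_no.copy()
    for j in range(1, x + 1):
        # True where the row j steps later in the same group is a 'no';
        # positions past the end of a group are filled with False.
        ahead = is_no.groupby(df['GroupID']).shift(-j, fill_value=False)
        keep |= ahead.astype(bool)
    return df[keep]

# Sample data from the question (Timestamp as an integer is an assumption)
df = pd.DataFrame({
    'StringCol': ['xyz', 'abc', 'def', 'ghi', 'abc', 'jkl', 'pqr', 'klm'],
    'Timestamp': [20170101] * 8,
    'GroupID': [123, 123, 123, 123, 124, 124, 124, 124],
    'Flag': ['yes', 'yes', 'yes', 'no', 'yes', 'yes', 'no', 'yes'],
})
result = rows_before_no(df, x=2)
```

Because the shift happens inside each group, the window never reaches back into the previous group, and a group without any "no" simply contributes no rows.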

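If you only need the last "no" in each group, try drop_duplicates. Note that this relies on the default integer index and does not check group boundaries.

import itertools

# Last 'no' row of each group; its index labels mark the end of each window
last_no = df[df['Flag'].eq('no')].drop_duplicates(['GroupID'], keep='last')

# For each such label, take the labels from (label - 2) through label inclusive
idx = last_no.index + 1
idy = last_no.index - 2
df.loc[list(itertools.chain(*[range(y, x) for x, y in zip(idx, idy)]))]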
Out[512]: 
  StringCol  Timestamp  GroupID Flag
1       abc   20170101      123  yes
2       def   20170101      123  yes
3       ghi   20170101      123   no
4       abc   20170101      124  yes
5       jkl   20170101      124  yes
6       pqr   20170101      124   no
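The label arithmetic above depends on a 0..n-1 index and can reach back into the previous group when a "no" sits near the top of its group. A positional variant avoids both issues. This is a sketch under assumptions: the helper name last_no_with_previous is hypothetical, and the rows of each group are assumed to be contiguous, as they are when the frame is sorted by GroupID.

```python
import numpy as np
import pandas as pd

def last_no_with_previous(df, x=2):
    """Last 'no' row per group plus up to x rows before it, by position."""
    # 0-based offset of every row inside its own group
    offset = df.groupby('GroupID').cumcount().to_numpy()
    # Positions of all 'no' rows and the group each one belongs to
    pos = np.flatnonzero(df['Flag'].eq('no').to_numpy())
    groups = df['GroupID'].to_numpy()[pos]
    # Largest position per group, i.e. the last 'no' in that group
    last = pd.Series(pos, index=groups).groupby(level=0).max().to_numpy()
    if len(last) == 0:
        return df.iloc[0:0]
    # Start x rows earlier, but never before the first row of the group
    starts = np.maximum(last - x, last - offset[last])
    take = np.concatenate([np.arange(s, e + 1) for s, e in zip(starts, last)])
    return df.iloc[take]

# Sample data from the question, deliberately given a non-default index
df = pd.DataFrame({
    'StringCol': ['xyz', 'abc', 'def', 'ghi', 'abc', 'jkl', 'pqr', 'klm'],
    'Timestamp': [20170101] * 8,
    'GroupID': [123, 123, 123, 123, 124, 124, 124, 124],
    'Flag': ['yes', 'yes', 'yes', 'no', 'yes', 'yes', 'no', 'yes'],
}, index=list('abcdefgh'))
result = last_no_with_previous(df, x=2)
```

Working with iloc positions means the index labels can be anything, including strings or duplicates.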
