Select a row based on column value and its previous 2 rows

+---+---+---+---+----+
| A | B | C | D | E  |
+---+---+---+---+----+
| 1 | 2 | 3 | 4 | VK |
| 1 | 4 | 6 | 9 | MD |
| 2 | 5 | 7 | 9 | V  |
| 2 | 3 | 5 | 8 | VK |
| 2 | 3 | 7 | 9 | V  |
| 1 | 1 | 1 | 1 | N  |
| 0 | 1 | 6 | 9 | V  |
| 1 | 2 | 5 | 7 | VK |
| 1 | 7 | 8 | 0 | MD |
| 1 | 5 | 7 | 9 | VK |
| 0 | 1 | 6 | 8 | V  |
+---+---+---+---+----+

I want to select a row based on a column value, together with its two previous rows. For example, in the given dataset (shown above) I want to select every row where column 'E' has the value 'VK', plus the two rows preceding each selected row. So we should get a dataset like this:

+---+---+---+---+----+
| A | B | C | D | E  |
+---+---+---+---+----+
| 1 | 2 | 3 | 4 | VK |
| 1 | 4 | 6 | 9 | MD |
| 2 | 5 | 7 | 9 | V  |
| 2 | 3 | 5 | 8 | VK |
| 2 | 3 | 7 | 9 | V  |
| 1 | 1 | 1 | 1 | N  |
| 1 | 2 | 5 | 7 | VK |
| 1 | 7 | 8 | 0 | MD |
| 1 | 5 | 7 | 9 | VK |
+---+---+---+---+----+

First we need to filter the DataFrame up to the last VK, then create the group key with cumsum on the reversed rows, then do groupby head.

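# Keep only the rows up to and including the last 'VK'.
df = df.loc[:df.E.eq('VK').loc[lambda x: x].index.max()]
# Reverse the rows so each 'VK' starts a new group, keep the first 3 rows per group, restore order.
df = df.iloc[::-1].groupby(df.E.eq('VK').iloc[::-1].cumsum()).head(3).sort_index()
df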
Out[102]: 
   A  B  C  D   E
0  1  2  3  4  VK
1  1  4  6  9  MD
2  2  5  7  9   V
3  2  3  5  8  VK
5  1  1  1  1   N
6  0  1  6  9   V
7  1  2  5  7  VK
8  1  7  8  0  MD
9  1  5  7  9  VK
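
For comparison, here is a minimal self-contained sketch of a different technique: a boolean mask built with `shift` instead of the reversed `groupby`. It is not the method from the answer above. The helper name `select_with_previous` and its `n_prev` parameter are hypothetical, introduced only for illustration, and the DataFrame is rebuilt from the table in the question. On this sample it keeps the same rows as the output above.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "A": [1, 1, 2, 2, 2, 1, 0, 1, 1, 1, 0],
    "B": [2, 4, 5, 3, 3, 1, 1, 2, 7, 5, 1],
    "C": [3, 6, 7, 5, 7, 1, 6, 5, 8, 7, 6],
    "D": [4, 9, 9, 8, 9, 1, 9, 7, 0, 9, 8],
    "E": ["VK", "MD", "V", "VK", "V", "N", "V", "VK", "MD", "VK", "V"],
})


def select_with_previous(frame, column, value, n_prev=2):
    """Hypothetical helper: keep rows equal to `value` plus the n_prev rows before each."""
    hit = frame[column].eq(value)
    keep = hit.copy()
    for k in range(1, n_prev + 1):
        # shift(-k) moves each flag k rows up, marking the row k positions before a hit.
        keep |= hit.shift(-k, fill_value=False)
    return frame[keep]


result = select_with_previous(df, "E", "VK")
print(result.index.tolist())  # [0, 1, 2, 3, 5, 6, 7, 8, 9]
```

The mask version does not need the rows after the last VK to be trimmed first, since those rows are never flagged.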

