![](/img/trans.png)
[英]Select rows by column value and include previous row by another column value
[英]Select a row based on column value and its previous 2 rows
+---+---+---+---+----+
| A | B | C | D | E |
+---+---+---+---+----+
| 1 | 2 | 3 | 4 | VK |
| 1 | 4 | 6 | 9 | MD |
| 2 | 5 | 7 | 9 | V |
| 2 | 3 | 5 | 8 | VK |
| 2 | 3 | 7 | 9 | V |
| 1 | 1 | 1 | 1 | N |
| 0 | 1 | 6 | 9 | V |
| 1 | 2 | 5 | 7 | VK |
| 1 | 7 | 8 | 0 | MD |
| 1 | 5 | 7 | 9 | VK |
| 0 | 1 | 6 | 8 | V |
+---+---+---+---+----+
我想根據列值及其前兩行 select 一行。 例如,在給定的數據集(在圖片上)我想 select 行基於“E”列值“VK”和該選定行的前兩行。 所以我們應該得到一個這樣的數據集:
+---+---+---+---+----+
| A | B | C | D | E |
+---+---+---+---+----+
| 1 | 2 | 3 | 4 | VK |
| 1 | 4 | 6 | 9 | MD |
| 2 | 5 | 7 | 9 | V |
| 2 | 3 | 5 | 8 | VK |
| 2 | 3 | 7 | 9 | V |
| 1 | 1 | 1 | 1 | N |
| 1 | 2 | 5 | 7 | VK |
| 1 | 7 | 8 | 0 | MD |
| 1 | 5 | 7 | 9 | VK |
+---+---+---+---+----+
首先,我們需要過濾 dataframe 直到最后一個 VK,然后使用 cumsum 創建cumsum
,然后進行groupby
head
df=df.loc[:df.E.eq('VK').loc[lambda x : x].index.max()]
df=df.iloc[::-1].groupby(df.E.eq('VK').iloc[::-1].cumsum()).head(3).sort_index()
df
Out[102]:
A B C D E
0 1 2 3 4 VK
1 1 4 6 9 MD
2 2 5 7 9 V
3 2 3 5 8 VK
5 1 1 1 1 N
6 0 1 6 9 V
7 1 2 5 7 VK
8 1 7 8 0 MD
9 1 5 7 9 VK
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.