[英]Python Pandas finds cumulative Min per group
我試圖找到 Desired_Output 列,其定義如下:對於每個名稱和主題組,找到所有先前分數的最小值。
Name Date Subj Score Desired_Output
A 2022-05-11 1200 70.88 69.60
A 2022-03-20 1200 69.96 69.60
A 2022-02-23 1200 69.60 69.63
A 2022-01-26 1200 69.63 70.22
A 2022-01-05 1200 70.35 70.22
A 2021-12-08 1200 70.22 70.69
A 2021-11-17 1000 56.73 null
A 2021-11-10 1200 70.69 null
B 2022-05-07 1600 96.16 96.53
B 2022-04-24 1600 94.53 null
B 2022-03-20 2000 124.60 null
B 2022-02-27 1800 109.16 null
B 2022-02-03 1400 82.54 null
這是數據集:
pd.DataFrame({
'Name': ['A','A','A','A','A','A','A','A','B','B','B','B','B'],
'Date': ['2022-05-11','2022-03-20','2022-02-23','2022-01-26','2022-01-05','2021-12-08','2021-11-17','2021-11-10','2022-05-07','2022-04-24','2022-03-20','2022-02-27','2022-02-03'],
'Subj': [1200,1200,1200,1200,1200,1200,1000,1200,1600,1600,2000,1800,1400],
'Score': [70.88,69.96,69.6,69.63,70.35,70.22,56.73,70.69,96.16,94.53,124.6,109.16,82.54]})
我不知道如何在 Pandas 中實現這一點,尤其是在沒有循環 DataFrame 的情況下。
假設日期以相反的順序排序,您可以使用反轉的cummin
+ 每組shift
:
df['Desired'] = (df[::-1]
.groupby(['Name', 'Subj'])['Score']
.apply(lambda s: s.cummin().shift())
)
Output:
Name Date Subj Score Desired
0 A 2022-05-11 1200 70.88 69.60
1 A 2022-03-20 1200 69.96 69.60
2 A 2022-02-23 1200 69.60 69.63
3 A 2022-01-26 1200 69.63 70.22
4 A 2022-01-05 1200 70.35 70.22
5 A 2021-12-08 1200 70.22 70.69
6 A 2021-11-17 1000 56.73 NaN
7 A 2021-11-10 1200 70.69 NaN
8 B 2022-05-07 1600 96.16 94.53
9 B 2022-04-24 1600 94.53 NaN
10 B 2022-03-20 2000 124.60 NaN
11 B 2022-02-27 1800 109.16 NaN
12 B 2022-02-03 1400 82.54 NaN
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.