Keep the last set of obs in each group that share the same (most recent) date

Question

Is there a way to keep only the most recent observations within a "group"?

For example, I want to keep only the latest observations for each PrimaryID-SecondaryID pair.

    PrimaryID SecondaryID  SubAccount    Value  ReportDate
0           1           A         123  5618.48  2022-01-01
1           1           A         456  8206.23  2022-01-01
2           1           A         123  6722.05  2022-07-01
3           1           A         456  5500.53  2022-07-01
4           1           B         789  8990.75  2022-02-01
5           1           B         987  6294.63  2022-02-01
6           1           B         789  8389.60  2022-03-01
7           1           B         246   343.02  2022-03-01
8           2           X         234  4157.57  2022-02-01
9           2           X         752  8218.00  2022-02-01
10          2           X         234  6430.68  2022-03-01
11          2           X         755  7148.57  2022-03-01
12          2           Y         731  5406.63  2022-05-02
13          2           Y         480  2429.83  2022-05-02
14          2           Y         731  6251.38  2022-06-01
15          2           Y         841  8256.93  2022-06-01

Here is one way to do it, but it seems sloppy.

df['lastRptDt'] = df.groupby(['PrimaryID', 'SecondaryID'])['ReportDate'].transform('max')
df1 = df[df['ReportDate'] == df['lastRptDt']]

This is the desired output:

    PrimaryID SecondaryID  SubAccount    Value  ReportDate   lastRptDt
2           1           A         123  6722.05  2022-07-01  2022-07-01
3           1           A         456  5500.53  2022-07-01  2022-07-01
6           1           B         789  8389.60  2022-03-01  2022-03-01
7           1           B         246   343.02  2022-03-01  2022-03-01
10          2           X         234  6430.68  2022-03-01  2022-03-01
11          2           X         755  7148.57  2022-03-01  2022-03-01
14          2           Y         731  6251.38  2022-06-01  2022-06-01
15          2           Y         841  8256.93  2022-06-01  2022-06-01
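
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same filter that skips the helper `lastRptDt` column and keeps the per-group maximum in a temporary Series instead. The sample values are copied from the table above; the variable names `keys` and `latest` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "PrimaryID": [1] * 8 + [2] * 8,
    "SecondaryID": ["A"] * 4 + ["B"] * 4 + ["X"] * 4 + ["Y"] * 4,
    "SubAccount": [123, 456, 123, 456, 789, 987, 789, 246,
                   234, 752, 234, 755, 731, 480, 731, 841],
    "Value": [5618.48, 8206.23, 6722.05, 5500.53, 8990.75, 6294.63,
              8389.60, 343.02, 4157.57, 8218.00, 6430.68, 7148.57,
              5406.63, 2429.83, 6251.38, 8256.93],
    "ReportDate": pd.to_datetime([
        "2022-01-01", "2022-01-01", "2022-07-01", "2022-07-01",
        "2022-02-01", "2022-02-01", "2022-03-01", "2022-03-01",
        "2022-02-01", "2022-02-01", "2022-03-01", "2022-03-01",
        "2022-05-02", "2022-05-02", "2022-06-01", "2022-06-01",
    ]),
})

keys = ["PrimaryID", "SecondaryID"]

# Latest ReportDate of each group, broadcast back to every row of df.
latest = df.groupby(keys)["ReportDate"].transform("max")

# Keep every row whose date equals its own group's latest date.
df1 = df[df["ReportDate"] == latest]
print(df1)
```

Because `transform` returns a Series aligned with `df`, the comparison is made row by row against that row's own group, and all rows sharing the latest date are kept.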

Answer 1

How about this?

df.set_index(['PrimaryID', 'SecondaryID', 'ReportDate']).loc[:,:,df.groupby(['PrimaryID', 'SecondaryID']).ReportDate.max()]

Out[54]: 
                                  SubAccount    Value  lastRptDt
PrimaryID SecondaryID ReportDate                                
1         A           2022-07-01         123  6722.05 2022-07-01
                      2022-07-01         456  5500.53 2022-07-01
          B           2022-03-01         789  8389.60 2022-03-01
                      2022-03-01         246   343.02 2022-03-01
2         X           2022-03-01         234  6430.68 2022-03-01
                      2022-03-01         755  7148.57 2022-03-01
          Y           2022-06-01         731  6251.38 2022-06-01
                      2022-06-01         841  8256.93 2022-06-01
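
One caveat, offered tentatively: the `.loc` lookup above appears to match `ReportDate` against the per-group maxima pooled together, so it seems to rely on no group containing an older row whose date happens to equal another group's latest date. The sketch below uses hypothetical data (not the question's) where that happens, and shows an inner merge on the group keys plus the date as one possible per-group alternative. The names `per_group` and `pooled` are only illustrative.

```python
import pandas as pd

# Hypothetical data: group (1, A) has an older row dated 2022-03-01,
# which is also the latest date of group (1, B).
df = pd.DataFrame({
    "PrimaryID": [1, 1, 1, 1],
    "SecondaryID": ["A", "A", "B", "B"],
    "SubAccount": [123, 123, 789, 789],
    "Value": [5618.48, 6722.05, 8990.75, 8389.60],
    "ReportDate": pd.to_datetime(
        ["2022-03-01", "2022-07-01", "2022-02-01", "2022-03-01"]
    ),
})
keys = ["PrimaryID", "SecondaryID"]

# One row per group holding that group's latest ReportDate.
latest = df.groupby(keys, as_index=False)["ReportDate"].max()

# Inner merge on the group keys AND the date keeps only each group's latest rows.
per_group = df.merge(latest, on=keys + ["ReportDate"])

# Matching dates against the pooled set of maxima also keeps the stale (1, A) row.
pooled = df[df["ReportDate"].isin(latest["ReportDate"])]

print(per_group)
print(pooled)
```

Note that the merge returns a fresh integer index rather than the original row labels, so it fits best when the original index does not need to be preserved.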

Solution 1
0 2022-07-01 01:05:57