Remove rows from a Pandas DataFrame by count

Question

With the following DataFrame ...

                     line_date line_track  line_race  c1pos
 horse_name                                                
 Grand Cicero       2013-03-10         GP          9      9
 Clever Story       2013-09-13        BEL          7      7
 Distorted Dream    2013-10-04        BEL          4      2
 Distorted Dream    2013-09-13        BEL          7      5
 Distorted Dream    2013-04-27        BEL          6      2
 Distorted Dream    2012-10-24        BEL          4      2
 Distorted Dream    2012-09-12        BEL          2      3
 Distorted Dream    2012-06-30        BEL          8      4
 Distorted Dream    2012-06-09        BEL          2      4
 Mr. O'Leary        2013-10-13        BEL          5      5
 Mr. O'Leary        2013-08-29        SAR          7      6
 Mr. O'Leary        2013-05-27        BEL          6      5
 In the Dark        2013-10-13        BEL          5      7
 In the Dark        2013-09-22        BEL          5      7
 In the Dark        2013-08-03        SAR          2      7
 In the Dark        2012-11-24        AQU          3      7
 In the Dark        2012-10-18        BEL          6      6
 Bred to Boss       2013-10-26        PRX          3      5
 Bred to Boss       2013-10-06        PRX          6      3
 Bred to Boss       2012-08-18        SAR          4      1

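... with the index set to horse_name, I need to "trim" each horse down to a certain number of records. For example, "Distorted Dream" has seven records. I need to reduce every horse that has more than three records down to three, which would produce a DataFrame like the one below. Is there a quick and easy way to do this?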

                     line_date line_track  line_race  c1pos
 horse_name                                                
 Grand Cicero       2013-03-10         GP          9      9
 Clever Story       2013-09-13        BEL          7      7
 Distorted Dream    2013-10-04        BEL          4      2
 Distorted Dream    2013-09-13        BEL          7      5
 Distorted Dream    2013-04-27        BEL          6      2
 Mr. O'Leary        2013-10-13        BEL          5      5
 Mr. O'Leary        2013-08-29        SAR          7      6
 Mr. O'Leary        2013-05-27        BEL          6      5
 In the Dark        2013-10-13        BEL          5      7
 In the Dark        2013-09-22        BEL          5      7
 In the Dark        2013-08-03        SAR          2      7
 Bred to Boss       2013-10-26        PRX          3      5
 Bred to Boss       2013-10-06        PRX          6      3
 Bred to Boss       2012-08-18        SAR          4      1

Answer 1

As usual, groupby to the rescue! The documentation is well worth a read, as there are a lot of useful tricks to pick up.

>>> df.groupby(level=0, sort=False, as_index=False).head(3)
                  line_date line_track  line_race  c1pos
horse_name                                              
Grand Cicero     2013-03-10         GP          9      9
Clever Story     2013-09-13        BEL          7      7
Distorted Dream  2013-10-04        BEL          4      2
Distorted Dream  2013-09-13        BEL          7      5
Distorted Dream  2013-04-27        BEL          6      2
Mr. O'Leary      2013-10-13        BEL          5      5
Mr. O'Leary      2013-08-29        SAR          7      6
Mr. O'Leary      2013-05-27        BEL          6      5
In the Dark      2013-10-13        BEL          5      7
In the Dark      2013-09-22        BEL          5      7
In the Dark      2013-08-03        SAR          2      7
Bred to Boss     2013-10-26        PRX          3      5
Bred to Boss     2013-10-06        PRX          6      3
Bred to Boss     2012-08-18        SAR          4      1

Or, if you want the last 3 instead:

>>> df.groupby(level=0, sort=False, as_index=False).tail(3)

(sort=False is only there to preserve the original horse order; you can drop it if you don't care about that.)
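To try the head / tail approach without the full data set, here is a minimal self-contained sketch. It rebuilds only a subset of the rows from the question (the variable names rows, first_three and last_three are just for illustration) and leaves out as_index, on the assumption that it has no effect on head / tail since they only filter rows.

```python
import pandas as pd

# A subset of the rows from the question, in the same order.
rows = [
    ("Grand Cicero",    "2013-03-10", "GP",  9, 9),
    ("Clever Story",    "2013-09-13", "BEL", 7, 7),
    ("Distorted Dream", "2013-10-04", "BEL", 4, 2),
    ("Distorted Dream", "2013-09-13", "BEL", 7, 5),
    ("Distorted Dream", "2013-04-27", "BEL", 6, 2),
    ("Distorted Dream", "2012-10-24", "BEL", 4, 2),
    ("Distorted Dream", "2012-09-12", "BEL", 2, 3),
    ("Mr. O'Leary",     "2013-10-13", "BEL", 5, 5),
    ("Mr. O'Leary",     "2013-08-29", "SAR", 7, 6),
]
columns = ["horse_name", "line_date", "line_track", "line_race", "c1pos"]
df = pd.DataFrame(rows, columns=columns).set_index("horse_name")

# Group on the index (level=0) and keep at most three rows per horse.
first_three = df.groupby(level=0, sort=False).head(3)
last_three = df.groupby(level=0, sort=False).tail(3)

print(first_three)
print(last_three)
```

Horses with three or fewer records pass through unchanged; only "Distorted Dream" loses rows here.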

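You could also sort on the line_date column (it would be safer to convert line_date to datetime, but YYYY-MM-DD strings will sort correctly as they are) and use the same head / tail approach to select the first or last three chronologically.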
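The sort-then-trim idea might look like the sketch below. It assumes a reasonably recent pandas where sort_values is available, uses a few deliberately shuffled rows based on the question's data, and introduces the names by_date and latest_three purely for illustration.

```python
import pandas as pd

# A few rows from the question, deliberately out of date order.
df = pd.DataFrame(
    {
        "horse_name": ["Distorted Dream", "Distorted Dream", "Clever Story",
                       "Distorted Dream", "Distorted Dream", "Distorted Dream"],
        "line_date": ["2012-06-09", "2013-10-04", "2013-09-13",
                      "2012-10-24", "2013-09-13", "2013-04-27"],
        "c1pos": [4, 2, 7, 2, 5, 2],
    }
).set_index("horse_name")

# Convert to real datetimes so the sort does not depend on string format.
df["line_date"] = pd.to_datetime(df["line_date"])

# Newest first, then keep the first three rows of each horse.
by_date = df.sort_values("line_date", ascending=False)
latest_three = by_date.groupby(level=0, sort=False).head(3)

print(latest_three)
```

Sorting ascending and calling tail(3) would select the same three most recent records per horse, just listed oldest first.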

Answer 1: accepted, 2013-11-10 00:07:10
