How do I rank within a groupby object based on another row condition in Pandas? Example included

Question

The dataframe below has 4 columns: runner, race_date, height_in_inches, top_ten_finish.

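I want to group by race_date and, for each runner who finished in the top ten on that race_date, rank his height_in_inches among the other runners who also finished in the top ten on that race_date. How can I do this?

This is the original dataframe: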

>>> import pandas as pd
>>> d = {"runner":['mike','paul','jim','dave','douglas'],
...     "race_date":['2019-02-02','2019-02-02','2020-02-02','2020-02-01','2020-02-01'],
...      "height_in_inches":[72,68,70,74,73],
...     "top_ten_finish":["yes","yes","no","yes","no"]}
>>> df = pd.DataFrame(d)
>>> df
    runner   race_date  height_in_inches top_ten_finish
0     mike  2019-02-02                72            yes
1     paul  2019-02-02                68            yes
2      jim  2020-02-02                70             no
3     dave  2020-02-01                74            yes
4  douglas  2020-02-01                73             no
>>>

This is the result I want. Note that runners who did not finish in the top 10 of their race get a value of 0 in the new column.

    runner   race_date  height_in_inches top_ten_finish  if_top_ten_height_rank
0     mike  2019-02-02                72            yes                       1
1     paul  2019-02-02                68            yes                       2
2      jim  2020-02-02                70             no                       0
3     dave  2020-02-01                74            yes                       1
4  douglas  2020-02-01                73             no                       0

Thanks!

Answer 1

We can filter the rows first, then use groupby with rank:

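# Rank heights among top-ten finishers within each race date (tallest = 1)
df['rank'] = df[df.top_ten_finish.eq('yes')].groupby('race_date')['height_in_inches'].rank(ascending=False)
# Rows outside the top ten get NaN after index alignment; assign the filled column back
df['rank'] = df['rank'].fillna(0)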
df
Out[87]: 
    runner   race_date  height_in_inches top_ten_finish  rank
0     mike  2019-02-02                72            yes   1.0
1     paul  2019-02-02                68            yes   2.0
2      jim  2020-02-02                70             no   0.0
3     dave  2020-02-01                74            yes   1.0
4  douglas  2020-02-01                73             no   0.0
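
One detail about the result above: `rank` defaults to `method='average'`, so the column comes back as floats and tied heights would share a fractional rank. The following is a minimal sketch, assuming integer ranks are wanted and that tied runners should share the lower rank; `method='min'` is an assumption on my part, since the question does not say how ties should be handled.

```python
import pandas as pd

df = pd.DataFrame({
    "runner": ["mike", "paul", "jim", "dave", "douglas"],
    "race_date": ["2019-02-02", "2019-02-02", "2020-02-02", "2020-02-01", "2020-02-01"],
    "height_in_inches": [72, 68, 70, 74, 73],
    "top_ten_finish": ["yes", "yes", "no", "yes", "no"],
})

# Rank only the top-ten finishers within each race date (tallest = 1).
# method="min" is an assumed tie rule: tied heights share the lower rank.
top_ten = df[df["top_ten_finish"].eq("yes")]
ranks = top_ten.groupby("race_date")["height_in_inches"].rank(
    ascending=False, method="min"
)

# Assignment aligns on the index, so rows outside the top ten become NaN.
df["if_top_ten_height_rank"] = ranks
# Fill those rows with 0 and convert the column to integers.
df["if_top_ten_height_rank"] = df["if_top_ten_height_rank"].fillna(0).astype(int)
print(df)
```

Assigning the filled column back, rather than calling `fillna(..., inplace=True)` on `df['rank']`, avoids relying on chained in-place modification.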

Answer 2

You can filter, rank within groupby(), and then assign the result back:

df['if_top_ten_height_rank'] = (df.loc[df['top_ten_finish']=='yes','height_in_inches']
                                   .groupby(df['race_date']).rank(ascending=False)
                                   .reindex(df.index, fill_value=0)
                                   .astype(int)
                                )

Output:

    runner    race_date      height_in_inches  top_ten_finish      if_top_ten_height_rank
--  --------  -----------  ------------------  ----------------  ------------------------
 0  mike      2019-02-02                   72  yes                                      1
 1  paul      2019-02-02                   68  yes                                      2
 2  jim       2020-02-02                   70  no                                       0
 3  dave      2020-02-01                   74  yes                                      1
 4  douglas   2020-02-01                   73  no                                       0
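
A possible variation on the same idea, offered as a sketch rather than as either answerer's method: instead of filtering rows out and reindexing afterwards, mask the heights of non-top-ten runners with `Series.where`. With the default `na_option='keep'`, `rank` leaves missing values unranked, so the result already has the full index. The variable names `is_top_ten` and `masked_height` are my own.

```python
import pandas as pd

df = pd.DataFrame({
    "runner": ["mike", "paul", "jim", "dave", "douglas"],
    "race_date": ["2019-02-02", "2019-02-02", "2020-02-02", "2020-02-01", "2020-02-01"],
    "height_in_inches": [72, 68, 70, 74, 73],
    "top_ten_finish": ["yes", "yes", "no", "yes", "no"],
})

is_top_ten = df["top_ten_finish"].eq("yes")

# Replace heights of runners outside the top ten with NaN; all rows are kept.
masked_height = df["height_in_inches"].where(is_top_ten)

df["if_top_ten_height_rank"] = (
    masked_height.groupby(df["race_date"])  # group by race date, aligned on index
    .rank(ascending=False)                  # NaN heights stay NaN (unranked)
    .fillna(0)                              # unranked rows get 0
    .astype(int)
)
print(df)
```

Because no rows are dropped, no `reindex` step is needed; the trade-off is an intermediate float Series with NaN values.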

Problem description

2 solutions

Solution 1
4 2020-06-11 13:47:55

Solution 2
2 2020-06-11 13:48:27
