How do I sort a DataFrame/pivot table by the distinct count of a subcategory?

Question

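I am trying to sort the first column of a DataFrame by the distinct count of the second-column values that correspond to each first-column value.

Unsorted data in the pivot table: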

investor  company round roundSize
investor1   Foo     A      10
investor2   Bar     A      10
            Foo     A      10
investor3   Bar     A      10
                    B      15
investor4   Bar     B      15
            Baz     C      100
            Foo     A      10

After sorting, the table should look like this:

investor  company round roundSize
investor4   Bar     B      15
            Baz     C      100
            Foo     A      10
investor2   Bar     A      10
            Foo     A      10
investor3   Bar     A      10
                    B      15
investor1   Foo     A      10

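Here, investor4 has a distinct count of 3 in the second column (company), so investor4 and its matching rows should be at the top.

investor3 and investor1 both have a distinct count of 1, so a secondary sort on either the number of rounds or the average round size would be preferable (but is not required).

I am new to python/pandas and am struggling to find an example that does this. The pandas documentation is good, but it does not quite cover this kind of problem.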

https://pandas.pydata.org/pandas-docs/version/0.15.0/reshaping.html

Any help would be greatly appreciated!

Answer 1

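Reset the index so that the pivot table becomes a flat DataFrame. Keep the default drop=False so that the investor and company index levels are retained as columns:

>>> df = df.reset_index()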
>>> df
    investor company round  roundSize
0  investor1     Foo     A         10
1  investor2     Bar     A         10
2  investor2     Foo     A         10
3  investor3     Bar     A         10
4  investor3     Bar     B         15
5  investor4     Bar     B         15
6  investor4     Baz     C        100
7  investor4     Foo     A         10
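
As a small illustration of the flattening step, the sketch below rebuilds a three-row slice of the pivot table and compares the two forms of reset_index. The variable names pivot, flat, and dropped are assumptions made for this example and do not come from the original post.

```python
import pandas as pd

# Hypothetical three-row slice of the pivot table from the question.
pivot = pd.DataFrame(
    {"round": ["A", "A", "B"], "roundSize": [10, 10, 15]},
    index=pd.MultiIndex.from_tuples(
        [("investor1", "Foo"), ("investor3", "Bar"), ("investor3", "Bar")],
        names=["investor", "company"],
    ),
)

flat = pivot.reset_index()               # index levels become ordinary columns
dropped = pivot.reset_index(drop=True)   # index levels are discarded

print(list(flat.columns))
print(list(dropped.columns))
```

Only the first form leaves an investor column available for the groupby in the next step.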

Create a sort key column and sort by it:

>>> df['sort_idx'] = df.groupby('investor')['company'].transform('nunique')
>>> df.sort_values('sort_idx', ascending=False)
    investor company round  roundSize  sort_idx
5  investor4     Bar     B         15         3
6  investor4     Baz     C        100         3
7  investor4     Foo     A         10         3
1  investor2     Bar     A         10         2
2  investor2     Foo     A         10         2
0  investor1     Foo     A         10         1
3  investor3     Bar     A         10         1
4  investor3     Bar     B         15         1
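
The question also asks for an optional secondary sort on the number of rounds. A minimal sketch of one way to do that, assuming the flat DataFrame shown above: the helper column names n_companies and n_rounds are made up for this example, and the investor name is added as a last key only to keep each investor's rows together.

```python
import pandas as pd

# Sample data copied from the question, already flattened into columns.
df = pd.DataFrame({
    "investor": ["investor1", "investor2", "investor2", "investor3",
                 "investor3", "investor4", "investor4", "investor4"],
    "company": ["Foo", "Bar", "Foo", "Bar", "Bar", "Bar", "Baz", "Foo"],
    "round": ["A", "A", "A", "A", "B", "B", "C", "A"],
    "roundSize": [10, 10, 10, 10, 15, 15, 100, 10],
})

grouped = df.groupby("investor")
# Primary key: number of distinct companies per investor.
df["n_companies"] = grouped["company"].transform("nunique")
# Secondary key: number of rounds (rows) per investor.
df["n_rounds"] = grouped["round"].transform("size")

result = (
    df.sort_values(["n_companies", "n_rounds", "investor"],
                   ascending=[False, False, True])
      .drop(columns=["n_companies", "n_rounds"])  # remove the helper columns
      .reset_index(drop=True)
)
print(result)
```

With these keys, investor3 (two rounds) is placed ahead of investor1 (one round), which matches the order requested in the question.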

Answer 2

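Use DataFrame.groupby with DataFrame.sort_values and DataFrame.reindex:

# Rows per investor (index level 0); this counts rows, not distinct companies.
df['order'] = df.groupby(level=0)['round'].transform('size')
# Sort by the helper column, then drop it by keeping all but the last column.
df = df.sort_values('order', ascending=False).reindex(columns=df.columns[:-1])
print(df)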

                  round  roundSize
investor  company                 
investor4 Bar         B         15
          Baz         C        100
          Foo         A         10
investor2 Bar         A         10
          Foo         A         10
investor3 Bar         A         10
          Bar         B         15
investor1 Foo         A         10
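
If the MultiIndex should be kept and the primary key should be the distinct company count rather than the row count, something along these lines may work. This is a sketch under assumptions: the pivot table is rebuilt by hand from the question's data, and the helper names n_companies and n_rounds are invented for the example.

```python
import pandas as pd

# Pivot table rebuilt by hand from the question's data.
index = pd.MultiIndex.from_tuples(
    [("investor1", "Foo"), ("investor2", "Bar"), ("investor2", "Foo"),
     ("investor3", "Bar"), ("investor3", "Bar"), ("investor4", "Bar"),
     ("investor4", "Baz"), ("investor4", "Foo")],
    names=["investor", "company"],
)
pivot = pd.DataFrame(
    {"round": ["A", "A", "A", "A", "B", "B", "C", "A"],
     "roundSize": [10, 10, 10, 10, 15, 15, 100, 10]},
    index=index,
)

# Distinct companies per investor, computed from the index levels.
levels = pivot.index.to_frame(index=False)
n_companies = levels.groupby("investor")["company"].transform("nunique").to_numpy()
# Rows (rounds) per investor, used only to break ties.
n_rounds = pivot.groupby(level="investor")["round"].transform("size").to_numpy()

result = (
    pivot.assign(n_companies=n_companies, n_rounds=n_rounds)
         # "investor" is an index level; sort_values accepts level names in `by`.
         .sort_values(["n_companies", "n_rounds", "investor"],
                      ascending=[False, False, True])
         .drop(columns=["n_companies", "n_rounds"])
)
print(result)
```

The helper values are attached as plain arrays so that no index alignment is attempted on the duplicated (investor3, Bar) entries.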

Answer 1
Score 1, accepted, 2019-11-02 19:10:51

Answer 2
Score 0, 2019-11-02 21:14:10
