Selecting rows from a pandas DataFrame limited by a count column

Question

I have a DataFrame defined as follows:

import pandas as pd

df = pd.DataFrame({'id':    [11, 12, 13, 14, 21, 22, 31, 32, 33],
                   'class': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'count': [2, 2, 2, 2, 1, 1, 2, 2, 2]})

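For each class, I want to select the first n rows, where n is given by the count column. For the DataFrame above, that means keeping ids 11 and 12 for class A, id 21 for class B, and ids 31 and 32 for class C.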

How can I achieve this?

Answer 1

Use:

(df.groupby('class', as_index=False, group_keys=False)
   .apply(lambda x: x.head(x['count'].iloc[0])))

Output:

   id class  count
0  11     A      2
1  12     A      2
4  21     B      1
6  31     C      2
7  32     C      2

Answer 2

You can use

In [771]: df.groupby('class').apply(
                     lambda x: x.head(x['count'].iloc[0])
                  ).reset_index(drop=True)
Out[771]:
   id class  count
0  11     A      2
1  12     A      2
2  21     B      1
3  31     C      2
4  32     C      2

Answer 3

Use cumcount:

df[(df.groupby('class').cumcount()+1).le(df['count'])]
Out[150]: 
  class  count  id
0     A      2  11
1     A      2  12
4     B      1  21
6     C      2  31
7     C      2  32
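
To make the one-liner easier to follow, here is a minimal sketch that splits it into steps. The intermediate names position, mask and result are introduced only for illustration; the DataFrame is the one from the question.

```python
import pandas as pd

df = pd.DataFrame({'id':    [11, 12, 13, 14, 21, 22, 31, 32, 33],
                   'class': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'count': [2, 2, 2, 2, 1, 1, 2, 2, 2]})

# 0-based position of each row within its class,
# aligned with the original index: 0, 1, 2, 3, 0, 1, 0, 1, 2
position = df.groupby('class').cumcount()

# A row is kept while its 1-based position does not exceed its own count
mask = (position + 1).le(df['count'])

# Boolean indexing keeps the original index labels (0, 1, 4, 6, 7)
result = df[mask]
print(result)
```

Since the mask is an ordinary boolean Series, no Python function is called per group, and each row is compared against the count value on that same row.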

Answer 4

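This solution groups by class, then reads the first count value in each smaller DataFrame and returns that many rows from the top of the group.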

def func(df_):
    count_val = df_['count'].values[0]
    return df_.iloc[0:count_val]

df.groupby('class', group_keys=False).apply(func)

Returns

  class  count  id
0     A      2  11
1     A      2  12
4     B      1  21
6     C      2  31
7     C      2  32
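
The same idea can also be sketched without apply, by iterating over the groups and concatenating the pieces. This is an alternative offered for illustration, not part of the answer above: first_n_per_class is a hypothetical helper name, and the sketch assumes, as the answer does, that count is constant within each class.

```python
import pandas as pd

df = pd.DataFrame({'id':    [11, 12, 13, 14, 21, 22, 31, 32, 33],
                   'class': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'count': [2, 2, 2, 2, 1, 1, 2, 2, 2]})


def first_n_per_class(frame, key='class', n_col='count'):
    """Keep the first n rows of each group, with n read from the group's first row.

    Hypothetical helper for illustration; not a pandas function.
    """
    pieces = [group.head(int(group[n_col].iloc[0]))
              for _, group in frame.groupby(key)]
    # Concatenating the pieces preserves the original index labels
    return pd.concat(pieces)


result = first_n_per_class(df)
print(result)
```

Iterating over a groupby yields each sub-frame with all of its columns, including the grouping column. That may matter on newer pandas releases, where the handling of the grouping column inside groupby.apply has been changing; check the documentation for the version you run.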


4 solutions

Solution 1
2 2018-08-16 19:16:57

Solution 2
1 Accepted 2018-08-16 19:16:45

Solution 3
1 2018-08-16 19:20:55

Solution 4
0 2018-08-16 19:19:17
