Pandas groupby on multiple values

Question

Start with a sorted table:

Index | A | B | C       |  
0     | A1| 0 | Group 1 |  
1     | A1| 0 | Group 1 |  
2     | A1| 1 | Group 2 |  
3     | A1| 1 | Group 2 |  
4     | A1| 2 | Group 3 |  
5     | A1| 2 | Group 3 |  
6     | A2| 7 | Group 4 |  
7     | A2| 7 | Group 4 |

The desired result is records 0, 1, 2, 3, 6, and 7.

First I want to create groups based on columns A and B. Then I want only the first two subgroups of each column A group returned. For each selected subgroup, I want all of its records returned.

Thank you so much.

Answer 1

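Use pd.factorize within a groupby and keep the rows whose code is less than 2. Within each A group, factorize numbers the distinct B values 0, 1, 2, ... in order of first appearance, so the first two subgroups get codes 0 and 1.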

df[df.groupby('A').B.transform(lambda x: x.factorize()[0]).lt(2)]
# same as
# df[df.groupby('A').B.transform(lambda x: x.factorize()[0]) < 2]

    A  B        C
0  A1  0  Group 1
1  A1  0  Group 1
2  A1  1  Group 2
3  A1  1  Group 2
6  A2  7  Group 4
7  A2  7  Group 4
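
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question; the names `codes` and `result` are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Rebuild the table from the question (values copied from the question).
df = pd.DataFrame({
    "A": ["A1"] * 6 + ["A2"] * 2,
    "B": [0, 0, 1, 1, 2, 2, 7, 7],
    "C": ["Group 1"] * 2 + ["Group 2"] * 2 + ["Group 3"] * 2 + ["Group 4"] * 2,
})

# Within each A group, number the distinct B values 0, 1, 2, ...
# in order of first appearance.
codes = df.groupby("A")["B"].transform(lambda s: pd.factorize(s)[0])

# Keep every row that belongs to one of the first two B subgroups of its A group.
result = df[codes.lt(2)]
print(result)
```

Because factorize numbers values by first appearance, "first two subgroups" here means the first two distinct B values encountered in row order, which matches the sorted table in the question.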

Answer 1: score 2, accepted, 2017-06-12 20:27:58
