Pandas groupby on multiple values
Start with a sorted table:
Index | A  | B | C       |
0     | A1 | 0 | Group 1 |
1     | A1 | 0 | Group 1 |
2     | A1 | 1 | Group 2 |
3     | A1 | 1 | Group 2 |
4     | A1 | 2 | Group 3 |
5     | A1 | 2 | Group 3 |
6     | A2 | 7 | Group 4 |
7     | A2 | 7 | Group 4 |
The desired result is records 0, 1, 2, 3, 6, 7.
First, I want to create groups based on columns A and B. Then I want only the first two subgroups of each column A group returned, and I want all the records in each returned subgroup.
Thank you so much.
Use pd.factorize within a groupby on A to number the distinct B values in each group, then keep the rows whose code is less than 2:
df[df.groupby('A').B.transform(lambda x: x.factorize()[0]).lt(2)]
# same as
# df[df.groupby('A').B.transform(lambda x: x.factorize()[0]) < 2]
A B C
0 A1 0 Group 1
1 A1 0 Group 1
2 A1 1 Group 2
3 A1 1 Group 2
6 A2 7 Group 4
7 A2 7 Group 4
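
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the table from the question and applies the same filter. The variable names `codes` and `result` are only illustrative, and the DataFrame constructor call is my own reconstruction of the data shown above.

```python
import pandas as pd

# Rebuild the sorted table from the question.
df = pd.DataFrame({
    "A": ["A1"] * 6 + ["A2"] * 2,
    "B": [0, 0, 1, 1, 2, 2, 7, 7],
    "C": ["Group 1"] * 2 + ["Group 2"] * 2 + ["Group 3"] * 2 + ["Group 4"] * 2,
})

# Within each A group, number the distinct B values in order of first
# appearance (0 for the first B value seen, 1 for the second, and so on).
codes = df.groupby("A")["B"].transform(lambda s: s.factorize()[0])

# Keep only rows belonging to the first two B subgroups of each A group.
result = df[codes.lt(2)]
print(result)
```

Note that factorize assigns codes by order of first appearance, not by sorted value. On a table sorted by A and B, as in the question, that is the same as taking the two smallest B values per A group; on unsorted data it would select the first two B values encountered instead.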