Subset dataframe and groupby pandas
Hi everyone, I have a data frame such as:
groups name
1 A
1 B
1 C
1 D
2 E
3 F
3 G
4 H
5 I
From that, I would like to keep in the data frame only the values that are alone in their group:
groups name
2 E
4 H
5 I
E, H and I are alone in their respective groups.
I tried:
df[df.groupby(['groups']).count() == 1 ]
But it does not seem to be the solution.
Use duplicated:
df[~df.groups.duplicated(keep=False)]
groups name
4 2 E
7 4 H
8 5 I
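To make the role of keep=False easier to see, here is a minimal, self-contained sketch. It assumes the frame from the question is rebuilt by hand with a default integer index; the variable names mask and singletons are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question (default 0..8 index assumed)
df = pd.DataFrame({
    "groups": [1, 1, 1, 1, 2, 3, 3, 4, 5],
    "name": list("ABCDEFGHI"),
})

# keep=False flags every row whose group value occurs more than once,
# including the first occurrence
mask = df["groups"].duplicated(keep=False)

# Negating the mask leaves only the rows whose group value is unique
singletons = df[~mask]
print(singletons)
```

With the default keep='first', the first row of each repeated group would not be flagged, so A and F would also survive the filter. That is why keep=False is needed here.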
Or, drop_duplicates:
df.drop_duplicates('groups', keep=False)
groups name
4 2 E
7 4 H
8 5 I
Use GroupBy.transform, which returns a result the same size as the original DataFrame:
df[df.groupby(['groups'])['name'].transform('size') == 1 ]
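The intermediate result of transform is what makes this work as a boolean mask, so it may help to look at it on its own. The sketch below assumes the same hand-built frame as in the question; sizes and result are illustrative names.

```python
import pandas as pd

# Rebuild the example frame from the question (default 0..8 index assumed)
df = pd.DataFrame({
    "groups": [1, 1, 1, 1, 2, 3, 3, 4, 5],
    "name": list("ABCDEFGHI"),
})

# transform broadcasts each group's size back onto every row of that group,
# so the result shares the index of df and can be compared row by row
sizes = df.groupby("groups")["name"].transform("size")
print(sizes.tolist())

# Keep only the rows that belong to a group of size 1
result = df[sizes == 1]
print(result)
```

This is also the reason the attempt in the question does not work as a filter: groupby(...).count() returns one row per group, indexed by the group label, so it does not line up with the rows of df. Note that 'size' counts rows including missing values, whereas 'count' skips NaN, so the two can differ if name contains missing entries.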