Subset dataframe and groupby pandas

Hi everyone, I have a data frame such as:

groups  name
1   A
1   B
1   C
1   D
2   E
3   F
3   G
4   H
5   I

From that, I would like to keep only the rows whose value is alone in its group:

groups  name
2   E
4   H
5   I

E, H and I are each alone in their respective groups.

I tried:

df[df.groupby(['groups']).count() == 1 ]

But it does not seem to be the solution.
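For context, here is a rough sketch of why that attempt does not line up, assuming df is rebuilt from the sample table above. groupby(...).count() returns one row per group, indexed by the group label, so the comparison yields a mask shaped like the groups rather than like the original rows. The names counts and singleton_groups are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "groups": [1, 1, 1, 1, 2, 3, 3, 4, 5],
    "name": list("ABCDEFGHI"),
})

counts = df.groupby(["groups"]).count()
# One row per group (index = group label), not one row per original row
print(counts.shape)  # (5, 1)
print(df.shape)      # (9, 2)

# One way to map the per-group counts back onto the rows
singleton_groups = counts.index[counts["name"] == 1]
result = df[df["groups"].isin(singleton_groups)]
print(result)
```

The isin step is just one possible way to bring the per-group result back to row level; the answers below avoid the intermediate frame entirely.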

Use duplicated with keep=False, which flags every row of a group that occurs more than once:

df[~df.groups.duplicated(keep=False)]

   groups name
4       2    E
7       4    H
8       5    I

Or, use drop_duplicates:

df.drop_duplicates('groups', keep=False)

   groups name
4       2    E
7       4    H
8       5    I
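A self-contained sketch of both variants, assuming the sample data from the question. The variable names mask, via_mask and via_drop are illustrative only.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "groups": [1, 1, 1, 1, 2, 3, 3, 4, 5],
    "name": list("ABCDEFGHI"),
})

# keep=False marks every member of a repeated group, not just the later ones
mask = df["groups"].duplicated(keep=False)
via_mask = df[~mask]

# Same idea in one call: drop every row whose 'groups' value repeats
via_drop = df.drop_duplicates("groups", keep=False)

print(via_mask)
print(via_mask.equals(via_drop))
```

Both forms keep the original index labels, which is why the output rows are labelled 4, 7 and 8.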

Use GroupBy.transform, which returns a result the same length as the original DataFrame, so it can be used directly as a row mask:

df[df.groupby(['groups'])['name'].transform('size') == 1 ]
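As a minimal sketch of what the transform step produces, again assuming the sample data from the question (sizes is an illustrative name):

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "groups": [1, 1, 1, 1, 2, 3, 3, 4, 5],
    "name": list("ABCDEFGHI"),
})

# Each row receives the size of the group it belongs to
sizes = df.groupby("groups")["name"].transform("size")
print(sizes.tolist())

# Rows whose group has exactly one member
result = df[sizes == 1]
print(result)
```

Because sizes shares the index of df, the comparison sizes == 1 aligns row by row. The same pattern extends to other thresholds, for example sizes >= 2 to keep only the groups with several members.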
