Remove groups that do not have at least two different values within a column in pandas

I have a dataframe such as:

COL1 COL2 SP
G1   A    SP1
G1   A    SP2
G2   B    SP1
G2   B    SP1
G3   C    SP7
G3   C    SP3
G4   A    SP8
G4   A    SP8

I would like to keep only the (COL1, COL2) groups that contain at least two different SP values.

I would then get:

COL1 COL2 SP
G1   A    SP1
G1   A    SP2
G3   C    SP7
G3   C    SP3

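Let us try transform with nunique: it counts the distinct SP values in each (COL1, COL2) group and broadcasts that count back to every row, so the result can be used directly as a boolean mask.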

out = df[df.groupby(['COL1','COL2'])['SP'].transform('nunique')>1]
Out[245]: 
  COL1 COL2   SP
0   G1    A  SP1
1   G1    A  SP2
4   G3    C  SP7
5   G3    C  SP3
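
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The dataframe is rebuilt from the sample in the question; the variable names distinct_sp and out_filter are only illustrative and do not appear in the original answer. The groupby.filter variant is shown as an alternative, not as part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "COL1": ["G1", "G1", "G2", "G2", "G3", "G3", "G4", "G4"],
    "COL2": ["A", "A", "B", "B", "C", "C", "A", "A"],
    "SP": ["SP1", "SP2", "SP1", "SP1", "SP7", "SP3", "SP8", "SP8"],
})

# Count distinct SP values per (COL1, COL2) group and broadcast
# the count back to every row of the original frame.
distinct_sp = df.groupby(["COL1", "COL2"])["SP"].transform("nunique")

# Keep only the rows whose group has more than one distinct SP value.
out = df[distinct_sp > 1]

# Alternative (illustrative): groupby.filter keeps whole groups for
# which the function returns True.
out_filter = df.groupby(["COL1", "COL2"]).filter(
    lambda g: g["SP"].nunique() > 1
)

print(out)
```

Both approaches keep the original row index. The transform version builds a single boolean mask, while filter calls a Python function once per group, so it may be slower when there are many groups. Note that nunique ignores NaN by default, so a group containing one SP value plus missing values would still be dropped.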
