Selecting unique observations in a pandas data frame
I have a pandas data frame with a column uniqueid. I would like to remove all duplicates from the data frame based on this column, so that every remaining observation has a unique uniqueid.
Use the duplicated method
Since we only care whether uniqueid (A in my example) is duplicated, select that column and call duplicated on the resulting Series. By default this marks every occurrence after the first as True. Then use ~ to flip the booleans, so the mask is True for the rows to keep.
In [90]: df = pd.DataFrame({'A': ['a', 'b', 'b', 'c'], 'B': [1, 2, 3, 4]})

In [91]: df
Out[91]:
   A  B
0  a  1
1  b  2
2  b  3
3  c  4

In [92]: df['A'].duplicated()
Out[92]:
0    False
1    False
2     True
3    False
Name: A, dtype: bool

In [93]: df.loc[~df['A'].duplicated()]
Out[93]:
   A  B
0  a  1
1  b  2
3  c  4
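
As a possible variation on the session above, the sketch below compares the boolean-mask approach with drop_duplicates(subset=...), which should give the same rows in one call, and shows how the keep argument changes which rows survive. It reuses the toy frame from the answer, with column A standing in for uniqueid. The variable names are only illustrative.

```python
import pandas as pd

# Same toy frame as in the answer; column 'A' stands in for uniqueid.
df = pd.DataFrame({'A': ['a', 'b', 'b', 'c'], 'B': [1, 2, 3, 4]})

# Boolean-mask approach from the answer: keep the first row per value of 'A'.
first_by_mask = df.loc[~df['A'].duplicated()]

# drop_duplicates with subset= keeps the first row per value of 'A' as well.
first_by_drop = df.drop_duplicates(subset='A')

# keep='last' keeps the last occurrence of each value instead of the first.
last_by_mask = df.loc[~df['A'].duplicated(keep='last')]

# keep=False flags every row of a duplicated group, so only values that
# occur exactly once remain after negating the mask.
singletons = df.loc[~df['A'].duplicated(keep=False)]

print(first_by_mask.equals(first_by_drop))
print(last_by_mask)
print(singletons)
```

Which variant fits depends on what "unique" should mean for the data: keeping one representative row per uniqueid (keep='first' or keep='last'), or discarding every uniqueid that appears more than once (keep=False).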