Drop rows from a DataFrame whose values occur only once in the whole column
I have a data frame like this:
import pandas as pd
data = [['bob', 1], ['james', 4], ['joe', 4], ['joe', 1], ['bob', 3], ['wendy', 5], ['joe', 7]]
df = pd.DataFrame(data, columns=['name', 'score'])
print(df)
which looks like:
name score
0 bob 1
1 james 4
2 joe 4
3 joe 1
4 bob 3
5 wendy 5
6 joe 7
I would like to drop all persons with only a single occurrence in a Pythonic way, i.e. the result should look like:
name score
0 bob 1
2 joe 4
3 joe 1
4 bob 3
6 joe 7
... and how would I do the same with entries that have only 1 or 2 occurrences? I.e.:
name score
2 joe 4
3 joe 1
6 joe 7
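Try this: use DataFrameGroupBy.nunique to get the count of unique elements in each group, then apply isin to filter on that count. Note that nunique counts distinct score values per name, so it equals the number of occurrences only when scores do not repeat within a name, as is the case in this data.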
g = df.groupby(['name'])['score'].transform('nunique')
df[~g.isin([1])]
name score
0 bob 1
2 joe 4
3 joe 1
4 bob 3
6 joe 7
df[~g.isin([1,2])]
name score
2 joe 4
3 joe 1
6 joe 7
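If the scores can repeat within a name, counting rows rather than distinct values may be safer. The following is a minimal sketch of that variant, not part of the answer above: it assumes the same df, and the helper name drop_rare and its max_count parameter are hypothetical names introduced here for illustration. It maps each name to its row count with value_counts and keeps only the names that occur more than max_count times.

```python
import pandas as pd

data = [['bob', 1], ['james', 4], ['joe', 4], ['joe', 1],
        ['bob', 3], ['wendy', 5], ['joe', 7]]
df = pd.DataFrame(data, columns=['name', 'score'])


def drop_rare(frame, column, max_count):
    """Drop rows whose value in `column` occurs at most `max_count` times.

    Hypothetical helper for illustration; it counts rows, not distinct scores.
    """
    # value_counts() gives the number of rows per distinct value;
    # map() broadcasts that count back onto every row.
    counts = frame[column].map(frame[column].value_counts())
    return frame[counts > max_count]


# Drop names that occur only once.
print(drop_rare(df, 'name', 1))

# Drop names that occur once or twice.
print(drop_rare(df, 'name', 2))
```

Comparing against a threshold (counts > max_count) avoids having to list every count to exclude, as isin([1, 2]) does.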