Drop rows from a dataframe whose values occur only once in the whole column

I have a data frame like this:

import pandas as pd

data = [['bob', 1], ['james', 4], ['joe', 4], ['joe', 1], ['bob', 3], ['wendy', 5], ['joe', 7]]
df = pd.DataFrame(data, columns=['name', 'score'])
print(df)

It looks like this:

    name  score
0    bob      1
1  james      4
2    joe      4
3    joe      1
4    bob      3
5  wendy      5
6    joe      7

I would like to drop all persons with only a single occurrence in a Pythonic way, i.e. the result should look like:

    name  score
0    bob      1
2    joe      4
3    joe      1
4    bob      3
6    joe      7

... and how would I do the same with entries that have only 1 or 2 occurrences, i.e.:

    name  score
2    joe      4
3    joe      1
6    joe      7

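Try this: use groupby with transform('nunique') to get the number of unique scores in each name group, aligned to the original rows, then apply isin to filter out the rows whose count is too low. Note that nunique counts distinct score values, which equals the number of occurrences here only because no name repeats a score; transform('size') counts rows directly.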

g = df.groupby(['name'])['score'].transform('nunique')

df[~g.isin([1])]

  name  score
0  bob      1
2  joe      4
3  joe      1
4  bob      3
6  joe      7
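To see why the filter works, it may help to look at what g actually holds. The sketch below just rebuilds the question's frame and prints the per-row counts and the resulting boolean mask; the variable name mask is introduced here for illustration only.

```python
import pandas as pd

data = [['bob', 1], ['james', 4], ['joe', 4], ['joe', 1],
        ['bob', 3], ['wendy', 5], ['joe', 7]]
df = pd.DataFrame(data, columns=['name', 'score'])

# Number of distinct scores per name, broadcast back onto every row.
g = df.groupby(['name'])['score'].transform('nunique')
print(g.tolist())     # [2, 1, 3, 3, 2, 1, 3]

# True for rows to keep: the per-name count is not 1.
mask = ~g.isin([1])
print(mask.tolist())  # [True, False, True, True, True, False, True]
```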

df[~g.isin([1,2])]

  name  score
2  joe      4
3  joe      1
6  joe      7
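If a name can appear several times with the same score, counting rows is probably safer than counting distinct scores. The following is a minimal sketch of that variant, not part of the answer above: the helper name drop_rare and its parameters are hypothetical, and the two 'amy' rows are made-up data added only to show the repeated-score case.

```python
import pandas as pd

# The question's data plus two illustrative rows ('amy' twice, same score).
data = [['bob', 1], ['james', 4], ['joe', 4], ['joe', 1],
        ['bob', 3], ['wendy', 5], ['joe', 7], ['amy', 2], ['amy', 2]]
df = pd.DataFrame(data, columns=['name', 'score'])


def drop_rare(frame, column, max_count):
    """Drop rows whose value in `column` occurs at most `max_count` times."""
    # transform('size') broadcasts each group's row count back onto its rows.
    counts = frame.groupby(column)[column].transform('size')
    return frame[counts > max_count]


at_least_twice = drop_rare(df, 'name', 1)   # drops james and wendy
at_least_thrice = drop_rare(df, 'name', 2)  # keeps only joe
print(at_least_twice)
print(at_least_thrice)
```

With transform('nunique') on score, 'amy' would be counted once and dropped; counting rows keeps her because she occurs twice. Using a threshold comparison instead of isin([1, 2]) also avoids listing every count to exclude.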
