
Pandas: keep the duplicate with the highest value

I have data similar to:

id value duplicate
a   200  yes
a   12   yes
b   42   yes
c   12   no
b   532  yes
b   21   yes
...

To track the duplicates I use

    df['duplicate'] = df.duplicated('id', keep=False)

However, I would like to keep the row with the highest value for each id and either mark or drop the other duplicates. Any suggestions?
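For reference, here is a small sketch of what that call produces on the sample rows above. The DataFrame is rebuilt by hand from the table in the question. With keep=False, every row whose id occurs more than once is flagged, which is why this column alone cannot say which row to keep.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "id": ["a", "a", "b", "c", "b", "b"],
    "value": [200, 12, 42, 12, 532, 21],
})

# keep=False flags every member of a duplicated group, not just the extras.
df["duplicate"] = df.duplicated("id", keep=False)

print(df)
```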

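Ah, I don't know why I didn't think of this first. Sort so that the highest value comes first within each id, then mark every later row with the same id as a duplicate:

    df = df.sort_values(['id', 'value'], ascending=[True, False])
    df['is_duplicated'] = df.duplicated('id', keep='first')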

Sorry!
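The sort-then-mark idea from the answer can be sketched end to end on the sample rows from the question. Two details here are adjustments rather than the poster's exact text: the sketch calls sort_values (DataFrame.sort no longer exists in current pandas) and sorts value in descending order, so that keep='first' retains the highest value rather than the lowest. The names df_sorted and highest are only illustrative.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "id": ["a", "a", "b", "c", "b", "b"],
    "value": [200, 12, 42, 12, 532, 21],
})

# Sort by id, and within each id put the highest value first.
df_sorted = df.sort_values(["id", "value"], ascending=[True, False])

# keep='first' leaves the first row of each id (the highest value) unmarked
# and flags every other row that shares the same id.
df_sorted["is_duplicated"] = df_sorted.duplicated("id", keep="first")

# To drop instead of mark, filter out the flagged rows.
highest = df_sorted[~df_sorted["is_duplicated"]].drop(columns="is_duplicated")

print(df_sorted)
print(highest)
```

If only the dropped result is needed, df.sort_values('value', ascending=False).drop_duplicates('id') should give the same rows, since drop_duplicates also keeps the first occurrence by default.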
