Select the max value in a column based on a condition

Question

I have two columns: ID and Percentage. Some IDs are not unique. Assume I have IDs 233, 233, 277, 277, with corresponding percentages 4.5%, 7%, 3%, 1%. I need to select the maximum percentage for each ID, so that the outcome is: 233 - 7%, 277 - 3%.

I wrote code that returns the max value for the whole column rather than the max for each non-unique ID.

df['help_column'] = np.where(df.duplicated() ==True, max(df['percentage']),0)

As the highest value in the whole column is 33%, I get 33% for ID 233 and 33% for ID 277 instead of the desired result. Thanks.

Answer 1

This is more of a transform:

df['help_column'] = df.groupby('ID')['percentage'].transform('max')
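To see what the transform produces, here is a minimal sketch on the sample values from the question. The column names `ID` and `percentage` follow the code above; storing the percentages as plain floats and the `max_rows` variable are assumptions made for illustration.

```python
import pandas as pd

# Sample data from the question; percentages stored as floats (an assumption).
df = pd.DataFrame({
    "ID": [233, 233, 277, 277],
    "percentage": [4.5, 7.0, 3.0, 1.0],
})

# transform('max') keeps the original row count: each row receives
# the maximum percentage of its own ID group.
df["help_column"] = df.groupby("ID")["percentage"].transform("max")

# Optional follow-up: keep only the rows that hold their group's maximum.
max_rows = df[df["percentage"] == df["help_column"]]

print(df)
print(max_rows)
```

Because the result is aligned with the original index, it can be assigned straight back as a new column, which is what the question's `help_column` needs.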

Answer 2

Try this:

df.groupby(['ID'])['percentage'].max()
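Unlike a transform, this collapses each group to a single value, so the result has one entry per ID rather than one per original row. A small sketch on the question's sample values, assuming float percentages; the names `max_per_id` and `summary` are hypothetical:

```python
import pandas as pd

# Sample data from the question; percentages stored as floats (an assumption).
df = pd.DataFrame({
    "ID": [233, 233, 277, 277],
    "percentage": [4.5, 7.0, 3.0, 1.0],
})

# One maximum per ID; the result is a Series indexed by ID.
max_per_id = df.groupby(["ID"])["percentage"].max()

# reset_index() turns the ID index back into an ordinary column,
# giving a two-column DataFrame with one row per ID.
summary = max_per_id.reset_index()

print(max_per_id)
print(summary)
```

This form suits the stated outcome (233 - 7%, 277 - 3%) when a summary table is wanted instead of an extra column on the original frame.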

2 answers

Answer 1
3 2019-08-21 16:06:52

Answer 2
1 2019-08-21 16:07:00
