More efficient way to iterate over a groupby on a pandas DataFrame?

Question

I have this code snippet that groups a pandas DataFrame by the ID column and appends the top-salary row for each unique ID to a result DataFrame. The code works but is rather slow with larger files. I was wondering if someone could suggest a more efficient way.

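import pandas as pd

# Group by ID and take the top-salary row from each group
groupe = df.groupby("ID")
t = (group.sort_values(by="Salary", ascending=False)[:1] for _, group in groupe)
result = pd.DataFrame()
for i in t:
    # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
    result = pd.concat([result, i])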

Answer 1

df.groupby('ID').max()

You can then select the Salary column.
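
As a minimal sketch of that suggestion, here is the aggregation on made-up data (the Name column and all values are assumptions for illustration, not from the question). Selecting the column before aggregating avoids computing a maximum for every column. Note also that max() works column by column, so the values in one row of a full-frame result need not come from the same original row.

```python
import pandas as pd

# Hypothetical sample data; not from the original question
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Name": ["Zed", "Amy", "Bob", "Cat"],
    "Salary": [100, 300, 250, 200],
})

# Highest salary per ID, returned as a Series indexed by ID
top_salary = df.groupby("ID")["Salary"].max()

# Full-frame max() takes each column's maximum independently,
# so for ID 1 it pairs the name "Zed" with Amy's salary of 300
per_column_max = df.groupby("ID").max()
```

If only the salary figures are needed, top_salary is enough; if the whole row matters, a row-preserving approach is safer.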

Edit

If you want to retain all other columns, even the non-numerical ones, this should do the job:

df.sort_values(by="Salary", ascending=False).groupby('ID').first()
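
A sketch of the sorted approach on made-up data follows (the Name column and all values are assumed for illustration). It also shows an alternative that is not part of the original answer: looking rows up with idxmax. Since groupby first() returns the first non-null value of each column, the idxmax lookup may be preferable when columns contain missing values and each row must stay intact.

```python
import pandas as pd

# Hypothetical sample data; not from the original question
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Name": ["Zed", "Amy", "Bob", "Cat"],
    "Salary": [100, 300, 250, 200],
})

# Approach from the answer: sort once by salary, then take the first row per ID
top_sorted = df.sort_values(by="Salary", ascending=False).groupby("ID").first()

# Alternative (an assumption, not from the answer): find the row label of the
# maximum salary in each group and select those rows from the original frame
top_rows = df.loc[df.groupby("ID")["Salary"].idxmax()]
```

The first result is indexed by ID, while the second keeps the original row labels and the ID column.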

Solution 1
0 Accepted 2016-04-27 14:55:12
