Groupby without losing columns

Question

I'm having an issue with a pandas dataframe. I have a dataframe with three columns: the first two are identifiers (str), and the third is a number.

I would like to group it by the first column, take the maximum of the third column within each group, and keep the value of the second column from the row where that maximum occurs.

That's not quite clear, so let's give an example. My dataframe looks like:

    id1              id2                amount
0   first_person     first_category     18
1   first_person     second_category    37
2   second_person    first_category     229
3   second_person    third_category     23

The code for it, if you need it:

import pandas as pd
df = pd.DataFrame([['first_person','first_category',18],['first_person','second_category',37],['second_person','first_category',229],['second_person','third_category',23]],columns = ['id1','id2','amount'])

And I would like to get:

    id1              id2                amount
0   first_person     second_category    37
1   second_person    first_category     229

I have tried a groupby method, but it makes me lose the second column:

result = df.groupby(['id1'],as_index=False).agg({'amount':np.max})
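For reference, a minimal sketch of what this attempt returns, assuming the dataframe defined above. The string 'max' is used in place of np.max here only to keep the snippet free of a NumPy import; the point is that agg only sees the columns it is told to aggregate, so id2 never reaches the result.

```python
import pandas as pd

df = pd.DataFrame(
    [['first_person', 'first_category', 18],
     ['first_person', 'second_category', 37],
     ['second_person', 'first_category', 229],
     ['second_person', 'third_category', 23]],
    columns=['id1', 'id2', 'amount'])

# Aggregate only 'amount'; 'id2' is not mentioned, so it is dropped.
attempt = df.groupby(['id1'], as_index=False).agg({'amount': 'max'})

# Only the grouping key and the aggregated column remain.
print(list(attempt.columns))
print(attempt)
```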

Answer 1

IIUC, you want to group by 'id1', find the row with the largest amount in each group using idxmax, and use the resulting index labels to select those rows from your original df:

In [9]:
df.loc[df.groupby('id1')['amount'].idxmax()]

Out[9]:
             id1              id2  amount
1   first_person  second_category      37
2  second_person   first_category     229
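The same approach as a self-contained script, in case it helps to run it outside an interactive session. This is a sketch under the assumption that the dataframe is the one from the question; the names max_labels and result, and the final reset_index(drop=True) to renumber the rows from 0, are additions for illustration and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    [['first_person', 'first_category', 18],
     ['first_person', 'second_category', 37],
     ['second_person', 'first_category', 229],
     ['second_person', 'third_category', 23]],
    columns=['id1', 'id2', 'amount'])

# For each id1 group, idxmax gives the index label of the row
# that holds the largest amount.
max_labels = df.groupby('id1')['amount'].idxmax()

# .loc selects the complete rows for those labels, so id2 is kept.
# reset_index(drop=True) renumbers the rows 0..n-1 (optional).
result = df.loc[max_labels].reset_index(drop=True)
print(result)
```

If several rows in a group share the maximum amount, idxmax returns the label of the first one, so exactly one row per group is kept.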

Answer 1 (2, accepted, 2016-04-26 09:50:09)
