Update pandas groupby group with column value

I have a test df like this:

import pandas as pd

df = pd.DataFrame({'A': ['Apple','Apple', 'Apple','Orange','Orange','Orange','Pears','Pears'],
                    'B': [1,2,9,6,4,3,2,1]
                   })
        A  B
0   Apple  1
1   Apple  2
2   Apple  9
3  Orange  6
4  Orange  4
5  Orange  3
6   Pears  2
7   Pears  1

Now I need to add a new column with the respective percentage differences in column 'B'. How can I do this? I cannot get it to work.

I have looked at "update column value of pandas groupby().last()", but I am not sure that it is pertinent to my problem.

And at this, which looks promising: "Pandas Groupby and Sum Only One Column".

For each group of column 'A', I need to find the maximum change in column 'B' and insert it into the column maxpercchng on all rows of that group. So I have come up with this code:

grouppercchng = ((df.groupby['A'].max() - df.groupby['A'].min())/df.groupby['A'].iloc[0])*100

and tried to add it to the group column 'maxpercchng' like so:

group['maxpercchng'] = grouppercchng

Or like so:

df_kpi_hot.groupby(['A'], as_index=False)['maxpercchng'] = grouppercchng

Does anyone know how to add the maxpercchng column to all rows in each group?

I believe you need transform, which returns a Series of the same size as the original DataFrame, filled with the aggregated values:

g = df.groupby('A')['B']
# transform broadcasts each group's aggregate back onto every row of that group
df['maxpercchng'] = (g.transform('max') - g.transform('min')) / g.transform('first') * 100

print(df)

        A  B  maxpercchng
0   Apple  1        800.0
1   Apple  2        800.0
2   Apple  9        800.0
3  Orange  6         50.0
4  Orange  4         50.0
5  Orange  3         50.0
6   Pears  2         50.0
7   Pears  1         50.0
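
To make the difference between aggregating and transforming concrete, here is a minimal sketch on the same test data. The names per_group_max and per_row_max are only illustrative and are not part of the original answer. It shows that an aggregation returns one value per group, while transform returns one value per original row, which is why the result can be assigned straight to a new column.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Apple', 'Apple', 'Apple', 'Orange', 'Orange', 'Orange', 'Pears', 'Pears'],
    'B': [1, 2, 9, 6, 4, 3, 2, 1],
})

g = df.groupby('A')['B']

# Aggregation: one value per group, indexed by the group key in 'A'.
per_group_max = g.max()

# transform: the group's value repeated on every row, indexed like df itself.
per_row_max = g.transform('max')

print(len(per_group_max))  # number of groups
print(len(per_row_max))    # number of rows in df
```

Because per_row_max shares its index with df, pandas can align it row by row on assignment. The aggregated per_group_max is indexed by the fruit names instead, so assigning it directly to a df column would not line up with the integer row index.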

Or, to get one aggregated value per group instead:

g = df.groupby('A')['B']
# one row per group: (max - min) / first value, as a percentage
df1 = ((g.max() - g.min()) / g.first() * 100).reset_index()
print(df1)

        A      B
0   Apple  800.0
1  Orange   50.0
2   Pears   50.0
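
If you compute the per-group values this way but still want them on every row, one option is to look each row's group key up in the aggregated Series with Series.map. This is a sketch of an alternative, not something the original answer showed, and the name pct is my own.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Apple', 'Apple', 'Apple', 'Orange', 'Orange', 'Orange', 'Pears', 'Pears'],
    'B': [1, 2, 9, 6, 4, 3, 2, 1],
})

g = df.groupby('A')['B']

# Per-group percentage change, as a Series indexed by the values of 'A'.
pct = (g.max() - g.min()) / g.first() * 100

# Look up each row's group key in pct to broadcast the value onto all rows.
df['maxpercchng'] = df['A'].map(pct)

print(df)
```

On this data it should give the same maxpercchng column as the transform version: Apple is (9 - 1) / 1 * 100 = 800, and Orange and Pears both work out to 50.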
