Pandas DataFrame question: apply a function to add a new column with the results

Question

import pandas as pd

df = pd.DataFrame({'label': 'a a b c b c'.split(), 'Val': [2, 2, 6, 4, 6, 8]})
df

  label  Val
0     a    2
1     a    2
2     b    6
3     c    4
4     b    6
5     c    8

df.groupby('label').apply(lambda x: x.mean())

       Val
label
a      2.0
b      6.0
c      6.0

I'd like something like this, where Results is each value divided by the mean of its label group:

  label  Val  Results
0     a    2        1
1     a    2        1
2     b    6        1
3     c    4   0.6667
4     b    6        1
5     c    8   1.3333

I'm not entirely sure how to do it. Does anyone have an idea? I tried this, but it didn't work:

df['Results'] = df.groupby('label').apply(lambda x: x/x.mean())

Answer 1

You are close: select the Val column after the groupby so that only that column is processed:

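# group_keys=False keeps the original row index; recent pandas versions (2.0+)
# otherwise prepend the group keys, and the assignment no longer aligns.
df['Results'] = df.groupby('label', group_keys=False)['Val'].apply(lambda x: x / x.mean())
print(df)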
  label  Val   Results
0     a    2  1.000000
1     a    2  1.000000
2     b    6  1.000000
3     c    4  0.666667
4     b    6  1.000000
5     c    8  1.333333

Another idea, which should improve performance, is to use GroupBy.transform. It returns a new Series filled with the aggregated values and with the same size as the original column, so the column can be divided by it directly:

df['Results'] = df['Val'].div(df.groupby('label')['Val'].transform('mean'))
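
To make the transform step more concrete, here is a minimal sketch that reuses the question's data. The names group_mean and via_apply are only illustrative and are not part of the original answer; the cross-check assumes a pandas version where groupby accepts group_keys=False.

```python
import pandas as pd

df = pd.DataFrame({'label': 'a a b c b c'.split(), 'Val': [2, 2, 6, 4, 6, 8]})

# transform('mean') broadcasts each group's mean back onto the original rows,
# so the result has the same index and length as df['Val'].
group_mean = df.groupby('label')['Val'].transform('mean')

# Element-wise division aligns on the shared row index.
df['Results'] = df['Val'].div(group_mean)

# Cross-check against the apply-based version; group_keys=False keeps the
# original row index so the two Series can be compared label by label.
via_apply = df.groupby('label', group_keys=False)['Val'].apply(lambda x: x / x.mean())
same = (via_apply.sort_index() - df['Results']).abs().max() < 1e-12

# Show the per-row group mean next to the final result.
print(df.assign(group_mean=group_mean))
print("apply and transform agree:", same)
```

Both versions should give the same Results column; the transform version avoids calling a Python lambda once per group, which is where the performance gain is expected to come from.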

Solution 1
3, accepted, 2020-10-27 09:32:37
