Pandas DataFrame question: apply a function and add a new column with the results
import pandas as pd
df = pd.DataFrame({'label': 'a a b c b c'.split(), 'Val': [2,2,6, 4,6, 8]})
df
  label  Val
0     a    2
1     a    2
2     b    6
3     c    4
4     b    6
5     c    8
df.groupby('label').apply(lambda x: x.mean())
       Val
label
a      2.0
b      6.0
c      6.0
I'd like something like this, where Results is each value divided by the mean of its label group:
  label  Val  Results
0     a    2        1
1     a    2        1
2     b    6        1
3     c    4   0.6667
4     b    6        1
5     c    8   1.3333
I'm not entirely sure how to do it. Does anyone have an idea? I tried this, but it didn't work:
df['Results'] = df.groupby('label').apply(lambda x: x/x.mean())
You are close. Select the column Val after groupby so that only this column is processed:
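# group_keys=False keeps the original row index on the result; in pandas 2.0 and later,
# apply otherwise prepends the group key to the index and the assignment no longer aligns
df['Results'] = df.groupby('label', group_keys=False)['Val'].apply(lambda x: x / x.mean())
print(df)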
  label  Val   Results
0     a    2  1.000000
1     a    2  1.000000
2     b    6  1.000000
3     c    4  0.666667
4     b    6  1.000000
5     c    8  1.333333
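To see what the per-group division is doing, here is a minimal sketch that performs the same computation with an explicit loop over the groups. The names `pieces` and `values` are only illustrative, and this is meant as an explanation rather than a recommended approach:

```python
import pandas as pd

df = pd.DataFrame({'label': 'a a b c b c'.split(), 'Val': [2, 2, 6, 4, 6, 8]})

# Iterating over the grouped column yields (group name, Series of that group's values).
# Each Series keeps the row index it had in df.
pieces = []
for name, values in df.groupby('label')['Val']:
    pieces.append(values / values.mean())

# Concatenating gives one Series covering every row; the assignment aligns it
# to df by index, so the group-wise ordering of the pieces does not matter.
df['Results'] = pd.concat(pieces)
print(df)
```

Because every piece carries its original index, the rows end up in the right place even though the groups are processed as a, b, c rather than in row order.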
Another idea, which should improve performance: use GroupBy.transform to build a new Series that is filled with the aggregated values and has the same size as the original column, so the division can be done directly:
df['Results'] = df['Val'].div(df.groupby('label')['Val'].transform('mean'))
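As a self-contained sketch of the transform approach, the snippet below splits it into two steps so the intermediate Series is visible. The name `group_mean` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'label': 'a a b c b c'.split(), 'Val': [2, 2, 6, 4, 6, 8]})

# transform('mean') returns one value per original row: the mean of that row's group.
group_mean = df.groupby('label')['Val'].transform('mean')
print(group_mean.tolist())

# Both Series share df's index, so the division is element-wise, row by row.
df['Results'] = df['Val'].div(group_mean)
print(df)
```

Since `group_mean` has the same index as `df`, no alignment step or lambda is needed, which is likely where the speed-up over `apply` comes from on larger frames.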