Chainable weighted average calculation in Pandas
I'm new to Pandas and want to convert the following simple R code to Pandas. It computes both the average and the weighted average of a column (in practice, there are many more columns to be aggregated). The solution has to be chainable, as there are multiple steps both before and after this calculation. I have looked at solutions using the apply function (Calculate weighted average using a pandas/dataframe), but it seems that one either has to do the full aggregation step (on all, perhaps unrelated, columns) inside the apply function, which I find ugly, or compute the average and weighted average separately and then do a table join afterwards. What is the state-of-the-art way to do this in Pandas?
library(dplyr)
df = data.frame(batch=c("A", "A", "B", "B", "C","C"), value=1:6, weight=1:6)
df %>%
  group_by(batch) %>%
  summarise(avg = mean(value), avg_weighted = sum(value*weight)/sum(weight))
# A tibble: 3 x 3
  batch   avg avg_weighted
  <chr> <dbl>        <dbl>
1 A       1.5         1.67
2 B       3.5         3.57
3 C       5.5         5.55
And here is my Pandas attempt:
import numpy as np
import pandas as pd

df2 = pd.DataFrame({'batch': ["A", "A", "B", "B", "C", "C"], 'value':[1,2,3,4,5,6], 'weight':[1,2,3,4,5,6]})

def agg_step(grp):
    # One-row frame per group: plain mean and weighted mean of 'value'
    return pd.DataFrame({'avg': [grp['value'].mean()],
                         'avg_weighted': [np.average(grp['value'], weights=grp['weight'])]})

(df2
 .groupby('batch')
 .apply(agg_step)
 .reset_index()
 .drop(columns='level_1')
)
Out[93]:
  batch  avg  avg_weighted
0     A  1.5      1.666667
1     B  3.5      3.571429
2     C  5.5      5.545455
This should work:
(df2.groupby("batch")
    .agg({
        "value": [
            "mean",
            lambda x: np.average(x, weights=df2.loc[x.index, "weight"])
        ]
    }))
Based on https://stackoverflow.com/a/31521177/1011724
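One caveat with the lambda above: it reads the weights from the outer df2 rather than from whatever frame reaches that point in the chain, so it may not behave as intended if earlier chain steps filter or modify the data. A possible alternative, shown here only as a sketch, mirrors the R formula sum(value*weight)/sum(weight) using assign and named aggregation (available in pandas 0.25 and later). The helper column names vw, vw_sum and w_sum are made up for this illustration.

```python
import pandas as pd

df2 = pd.DataFrame({
    "batch": ["A", "A", "B", "B", "C", "C"],
    "value": [1, 2, 3, 4, 5, 6],
    "weight": [1, 2, 3, 4, 5, 6],
})

result = (
    df2
    # Hypothetical helper column: value * weight, computed row by row
    .assign(vw=lambda d: d["value"] * d["weight"])
    .groupby("batch")
    # Named aggregation: output_name=(input_column, aggregation)
    .agg(
        avg=("value", "mean"),
        vw_sum=("vw", "sum"),
        w_sum=("weight", "sum"),
    )
    # Weighted mean = sum(value * weight) / sum(weight)
    .assign(avg_weighted=lambda d: d["vw_sum"] / d["w_sum"])
    # Drop the intermediate sums so only the final columns remain
    .drop(columns=["vw_sum", "w_sum"])
    .reset_index()
)

print(result)
```

Every step works on the frame passed down the chain, so it can sit between other steps, and further columns can be added to the same agg call without touching the weighted-average logic.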