Weighted average in pandas with weights based on the value of a column?
I have the following dataframe:
id type side score
601166 p right 2
601166 p left 6
601166 p right 2
601166 p left 4
601166 r left 2
601166 r left 2
601166 r right 6
601166 2
601009 r left 6
601009 r right 8
601939 p left 2
601939 p left 2
I have calculated the average score for each id, type and side with:
df_result=df.groupby(["id", "type","side"])["score"].mean()
id type side mean
601166 p right 2
601166 p left 5
601166 r right 6
601166 r left 2
601166 2
But now I would like to calculate the average score for each id and type, weighting the per-side averages: whichever side (left or right) has the lower average score counts for 75%, and the side with the higher average counts for 25%.
An example result for id 601166: first calculate the average for each side. For type p, the side with the lowest average (right, 2) counts for 75% and the other side (left, 5) for 25%, giving 0.75 * 2 + 0.25 * 5 = 2.75. Empty values can be skipped.
id type mean
601166 p 2.75
601166 r 3
Any idea how I can add this to my code?
Would something like this suffice?
df_result = df.groupby(["id", "type", "side"])["score"].mean()
g = df_result.groupby(["id", "type"])
g.min() * 0.75 + g.max() * 0.25
id      type
601009  r       6.50
601166  p       2.75
        r       3.00
601939  p       2.00
Name: score, dtype: float64
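For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the sample in the question, assuming the row with no type or side holds NaN in those columns. The names `side_means` and `weighted` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the row without type/side is assumed to be NaN.
df = pd.DataFrame(
    {
        "id": [601166] * 8 + [601009] * 2 + [601939] * 2,
        "type": ["p", "p", "p", "p", "r", "r", "r", np.nan, "r", "r", "p", "p"],
        "side": ["right", "left", "right", "left", "left", "left", "right", np.nan,
                 "left", "right", "left", "left"],
        "score": [2, 6, 2, 4, 2, 2, 6, 2, 6, 8, 2, 2],
    }
)

# Step 1: mean score per (id, type, side).
# groupby drops rows whose keys are NaN by default, so the empty row is skipped.
side_means = df.groupby(["id", "type", "side"])["score"].mean()

# Step 2: within each (id, type), the lower side mean gets 75%, the higher 25%.
g = side_means.groupby(level=["id", "type"])
weighted = (g.min() * 0.75 + g.max() * 0.25).rename("mean").reset_index()

print(weighted)
```

When an (id, type) pair only has one side, as with 601939 p, min and max are the same value, so the result is simply that side's mean.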