How to calculate a weighted average in Python for each unique value in two columns?
Assuming the "type" column does not affect your calculation, you can use groupby
to get the averages. Here is the data:
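import numpy as np
import pandas as pd
# Random sample data: the values change on every run unless a seed is set.
df = pd.DataFrame({'borough': ['b1', 'b2']*6, 'year': [2008, 2009, 2010, 2011]*3,
                   'average': np.random.randint(low=100, high=200, size=12),
                   'nobs': np.random.randint(low=1, high=40, size=12)})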
print(df)
   borough  year  average  nobs
0       b1  2008      166     1
1       b2  2009      177    35
2       b1  2010      114    27
3       b2  2011      187    18
4       b1  2008      193     2
5       b2  2009      105    27
6       b1  2010      114    36
7       b2  2011      144     3
8       b1  2008      114    39
9       b2  2009      157     6
10      b1  2010      133    17
11      b2  2011      176    12
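We add a new column that is the product of the average and nobs columns, then divide the per-group sum of that product by the per-group sum of nobs: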
df['average x nobs'] = df['average'] * df['nobs']
# Sum the products and the weights once per (borough, year) group, then divide.
sums = df.groupby(['borough', 'year'])[['average x nobs', 'nobs']].sum()
newdf = pd.DataFrame({'weighted average': sums['average x nobs'] / sums['nobs']})
print(newdf)
              weighted average
borough year
b1      2008        119.000000
        2010        118.037500
b2      2009        146.647059
        2011        179.090909
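As a cross-check, the same result can probably be obtained without the helper column by letting np.average apply the weights inside each group. The sketch below is only one possible variant, not the answer's original method. It hard-codes the sample values printed above so that the output is reproducible, and weighted_mean is an illustrative helper name, not part of pandas.

```python
import numpy as np
import pandas as pd

# Literal copy of the sample table printed above, so the result is reproducible.
df = pd.DataFrame({
    'borough': ['b1', 'b2'] * 6,
    'year': [2008, 2009, 2010, 2011] * 3,
    'average': [166, 177, 114, 187, 193, 105, 114, 144, 114, 157, 133, 176],
    'nobs': [1, 35, 27, 18, 2, 27, 36, 3, 39, 6, 17, 12],
})


def weighted_mean(group):
    # Hypothetical helper: np.average computes sum(values * weights) / sum(weights).
    return np.average(group['average'], weights=group['nobs'])


# Select only the two needed columns so the grouping keys are not passed
# to the helper.
weighted = (
    df.groupby(['borough', 'year'])[['average', 'nobs']]
      .apply(weighted_mean)
      .rename('weighted average')
)
print(weighted)
```

For the (b1, 2008) group this evaluates (166*1 + 193*2 + 114*39) / (1 + 2 + 39) = 4998 / 42 = 119, which matches the first row of the table above. The apply-based form is usually slower than the sum-and-divide approach on large frames, so it is best treated as a readability trade-off.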