Calculate weighted average based on 2 columns using a pandas DataFrame
I have the following dataframe df. I want to calculate a weighted average grouped by each date and Sector level.
date Equity value Sector Weight
2000-01-31 TLRA 20 RG Index 0.20
2000-02-28 TLRA 30 RG Index 0.20
2000-03-31 TLRA 40 RG Index 0.20
2000-01-31 RA 50 RG Index 0.30
2000-02-28 RA 60 RG Index 0.30
2000-03-31 RA 70 RG Index 0.30
2000-01-31 AAPL 80 SA Index 0.50
2000-02-28 AAPL 90 SA Index 0.50
2000-03-31 AAPL 100 SA Index 0.50
2000-01-31 SPL 110 SA Index 0.60
2000-02-28 SPL 120 SA Index 0.60
2000-03-31 SPL 130 SA Index 0.60
There can be many Equity entries under a Sector. I want a Sector-level weighted average based on the Weight column.
Expected Output:
date RG Index SA Index
2000-01-31 19 106
2000-02-28 24 117
2000-03-31 29 128
I tried the code below, but I am not getting the expected output. Please help.
g = df.groupby('Sector')
df['wa'] = df.value / g.value.transform("sum") * df.Weight
df.pivot(index='Sector', values='wa')
This is more of a pivot problem. First assign a new column as the product of value and Weight, then pivot with date as the index and Sector as the columns, summing the products:
df.assign(V=df.value*df.Weight).pivot_table(index='date',columns='Sector',values='V',aggfunc='sum')
Out[328]:
Sector RGIndex SAIndex
date
2000-01-31 19.0 106.0
2000-02-28 24.0 117.0
2000-03-31 29.0 128.0
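For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand (assuming the Sector labels are "RG Index" and "SA Index" with a space, as in the question's table, and that date is kept as a plain string), then applies the same assign and pivot_table step. The name result is only for illustration.

```python
import pandas as pd

# Rebuild the sample data from the question (typed in by hand).
df = pd.DataFrame({
    "date": ["2000-01-31", "2000-02-28", "2000-03-31"] * 4,
    "Equity": ["TLRA"] * 3 + ["RA"] * 3 + ["AAPL"] * 3 + ["SPL"] * 3,
    "value": [20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130],
    "Sector": ["RG Index"] * 6 + ["SA Index"] * 6,
    "Weight": [0.20] * 3 + [0.30] * 3 + [0.50] * 3 + [0.60] * 3,
})

# V is each row's weighted contribution (value * Weight).
# pivot_table then sums V for every (date, Sector) pair.
result = (
    df.assign(V=df["value"] * df["Weight"])
      .pivot_table(index="date", columns="Sector", values="V", aggfunc="sum")
)

print(result.round(2))
```

For 2000-01-31 this gives 20*0.20 + 50*0.30 = 19 for RG Index and 80*0.50 + 110*0.60 = 106 for SA Index, matching the output above.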