
Calculate weighted average based on 2 columns using a pandas DataFrame

I have the following dataframe df. I want to calculate a weighted average grouped by each date and Sector level.

 date     Equity    value    Sector   Weight
2000-01-31  TLRA    20      RG Index     0.20
2000-02-28  TLRA    30      RG Index     0.20
2000-03-31  TLRA    40      RG Index     0.20
2000-01-31   RA     50      RG Index     0.30
2000-02-28   RA     60      RG Index     0.30
2000-03-31   RA     70      RG Index     0.30
2000-01-31  AAPL    80      SA Index     0.50
2000-02-28  AAPL    90      SA Index     0.50
2000-03-31  AAPL    100     SA Index     0.50
2000-01-31  SPL     110     SA Index     0.60
2000-02-28  SPL     120     SA Index     0.60
2000-03-31  SPL     130     SA Index     0.60

There can be many Equity rows under a Sector. I want a Sector-level weighted average based on the Weight column.

Expected Output:

date        RG Index       SA Index
2000-01-31  19               106  
2000-02-28  24               117
2000-03-31  29               128

I tried the code below, but I am not getting the expected output. Please help.

g = df.groupby('Sector')
df['wa'] = df.value / g.value.transform("sum") * df.Weight
df.pivot(index='Sector', values='wa')

This is more of a pivot problem. First assign a new column as the product of value and Weight, then pivot on date and Sector, summing that column:

df.assign(V=df.value*df.Weight).pivot_table(index='date',columns='Sector',values='V',aggfunc='sum')
Out[328]: 
Sector      RGIndex  SAIndex
date                        
2000-01-31     19.0    106.0
2000-02-28     24.0    117.0
2000-03-31     29.0    128.0
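
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand (keeping dates as plain strings, which is an assumption), applies the same assign + pivot_table idea, and shows a groupby/unstack variant that should give the same table. The names `weighted`, `by_pivot`, `by_groupby`, `sums` and `normalized` are made up for illustration. The last step, dividing by the sum of the weights, is not something the question asked for; it is only there in case a normalized weighted average is wanted, since the weights within a Sector do not sum to 1 here.

```python
import pandas as pd

# Sample data copied from the question; dates are kept as strings (assumption).
df = pd.DataFrame({
    "date": ["2000-01-31", "2000-02-28", "2000-03-31"] * 4,
    "Equity": ["TLRA"] * 3 + ["RA"] * 3 + ["AAPL"] * 3 + ["SPL"] * 3,
    "value": [20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130],
    "Sector": ["RG Index"] * 6 + ["SA Index"] * 6,
    "Weight": [0.20] * 3 + [0.30] * 3 + [0.50] * 3 + [0.60] * 3,
})

# Per-row contribution: value * Weight.
weighted = df.assign(V=df["value"] * df["Weight"])

# Same approach as the answer: one row per date, one column per Sector.
by_pivot = weighted.pivot_table(
    index="date", columns="Sector", values="V", aggfunc="sum"
)

# Equivalent result with groupby + unstack.
by_groupby = weighted.groupby(["date", "Sector"])["V"].sum().unstack("Sector")

# Optional variant (not requested in the question): divide by the total
# weight per group to get a normalized weighted average.
sums = weighted.groupby(["date", "Sector"])[["V", "Weight"]].sum()
normalized = (sums["V"] / sums["Weight"]).unstack("Sector")

print(by_pivot.round(2))
```

The pivot_table form is compact when the wide layout is the final goal; the groupby form is handy if you need the long (date, Sector) result for further steps before reshaping.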
