Python Pandas mean and weighted average

I am new to Python pandas. Any help will be much appreciated.

This is my raw data:

         Feed  Close  Sector  Market_Cap
Date
2015-09-18 A   5.60  Property  50    
2015-09-21 A   5.60  Property  20    
2015-09-23 A   5.60  Property  30    
2015-09-18 ABC 0.67  Property  50    
2015-09-21 ABC 0.66  Property  80     
2015-09-18 DA  0.67  Mining    65    
2015-09-21 KK  1.66  Mining    80    

What I would like to get is this:

1. Create a new column called Mean that holds the average Market_Cap for each Feed.

2. Find the weighted average.

This is what I want:
         Feed  Close  Sector   Market_Cap   Mean   Sector_WeightedAvg
Date
2015-09-18 A   5.60  Property  50           33.33      33.33/(33.33+65) 
2015-09-21 A   5.60  Property  20           33.33      33.33/(33.33+65)
2015-09-23 A   5.60  Property  30           33.33      33.33/(33.33+65)
2015-09-18 ABC 0.67  Property  50           65         65/(33.33+65)
2015-09-21 ABC 0.66  Property  80           65         65/(33.33+65) 
2015-09-18 DA  0.67  Mining    65           65         65/(65+80)
2015-09-21 KK  1.66  Mining    80           80         80/(65+80)

This is my current code for the mean, which gives me NaN:

df3 = pd.DataFrame(df3)
df3['Mean'] = df3.groupby(by=['Sector'])['Market_Cap'].mean()

         Feed  Close  Sector   Market_Cap   Mean   
Date
2015-09-18 A   5.60  Property  50           NaN       
2015-09-21 A   5.60  Property  20           NaN      
2015-09-23 A   5.60  Property  30           NaN      
2015-09-18 ABC 0.67  Property  50           NaN             

And this is my code for the weighted average:

df2['WeightedAverage'] = df3['Market_Cap'].value / df3['Mean'].value

I got the error: 我得到了错误:

AttributeError: 'Series' object has no attribute 'value'

IIUC you can use transform and mean.

The weighted average is the Mean column divided by the sum of the unique Mean values within each Sector group of df3.
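As a rough illustration of why transform matters here, the sketch below contrasts a plain groupby mean with transform('mean'). The three-row frame and the names per_feed, misaligned and aligned are made up for the example; they are not from the question.

```python
import pandas as pd

# Tiny date-indexed frame, shaped like the question's data (values are illustrative).
df = pd.DataFrame(
    {"Feed": ["A", "A", "ABC"], "Market_Cap": [50, 20, 80]},
    index=pd.to_datetime(["2015-09-18", "2015-09-21", "2015-09-23"]),
)

# Aggregation returns one row per group, indexed by the group key ("A", "ABC").
per_feed = df.groupby("Feed")["Market_Cap"].mean()

# Assigning that Series to a date-indexed frame aligns on index labels.
# No label matches, so every row ends up NaN.
misaligned = df.assign(Mean=per_feed)

# transform broadcasts each group's mean back onto the original rows,
# so the result shares the frame's index and assigns cleanly.
aligned = df.assign(Mean=df.groupby("Feed")["Market_Cap"].transform("mean"))

print(misaligned)
print(aligned)
```

The AttributeError in the question is a separate issue: a Series has a .values attribute, not .value, and neither is needed for element-wise division of two columns that share an index.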

print(df3)
          Feed  Close    Sector  Market_Cap
Date                                        
2015-09-18    A   5.60  Property          50
2015-09-21    A   5.60  Property          20
2015-09-23    A   5.60  Property          30
2015-09-18  ABC   0.67  Property          50
2015-09-21  ABC   0.66  Property          80
2015-09-18   DA   0.67    Mining          65
2015-09-21   KK   1.66    Mining          80

df3['Mean'] = df3.groupby(by=['Feed'])['Market_Cap'].transform('mean')
df3['WeightedAverage'] = df3['Mean'] / df3.groupby(by=['Sector'])['Mean'].transform(lambda x: sum(x.unique()))
print(df3)
           Feed  Close    Sector  Market_Cap       Mean  WeightedAverage
Date                                                                    
2015-09-18    A   5.60  Property          50  33.333333         0.338983
2015-09-21    A   5.60  Property          20  33.333333         0.338983
2015-09-23    A   5.60  Property          30  33.333333         0.338983
2015-09-18  ABC   0.67  Property          50  65.000000         0.661017
2015-09-21  ABC   0.66  Property          80  65.000000         0.661017
2015-09-18   DA   0.67    Mining          65  65.000000         0.448276
2015-09-21   KK   1.66    Mining          80  80.000000         0.551724
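One caveat with the sum(x.unique()) denominator: if two feeds in the same sector happened to have the same mean, unique() would count that value only once. A possible variant, sketched below, sums one mean per (Sector, Feed) pair instead. The names feed_means and sector_totals are introduced for this example only, and the frame is rebuilt by hand from the data shown above.

```python
import pandas as pd

# Rebuild the frame from the question's data.
df3 = pd.DataFrame(
    {
        "Feed": ["A", "A", "A", "ABC", "ABC", "DA", "KK"],
        "Close": [5.60, 5.60, 5.60, 0.67, 0.66, 0.67, 1.66],
        "Sector": ["Property"] * 5 + ["Mining"] * 2,
        "Market_Cap": [50, 20, 30, 50, 80, 65, 80],
    },
    index=pd.to_datetime(
        ["2015-09-18", "2015-09-21", "2015-09-23",
         "2015-09-18", "2015-09-21", "2015-09-18", "2015-09-21"]
    ),
)
df3.index.name = "Date"

# Per-feed mean, broadcast back to every row.
df3["Mean"] = df3.groupby("Feed")["Market_Cap"].transform("mean")

# One mean per (Sector, Feed) pair, then the total of those means per sector.
feed_means = df3.groupby(["Sector", "Feed"])["Market_Cap"].mean()
sector_totals = feed_means.groupby(level="Sector").sum()

# Look up each row's sector total and divide.
df3["WeightedAverage"] = df3["Mean"] / df3["Sector"].map(sector_totals)
print(df3)
```

On this data the two approaches agree, since no two feeds in a sector share a mean.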

Try a combination of transform('sum') and mean.

In [5]: df
Out[5]: 
   Close Feed  Market_Cap    Sector
0   5.60    A          50  Property
1   5.60    A          20  Property
2   5.60    A          30  Property
3   0.67  ABC          50  Property
4   0.66  ABC          80  Property
5   0.67   DA          65    Mining
6   1.66   KK          80    Mining

In [6]: g = df.groupby(['Sector', 'Feed'])

In [7]: c = g.Market_Cap.mean()

In [8]: c
Out[8]: 
Sector    Feed
Mining    DA      65.000000
          KK      80.000000
Property  A       33.333333
          ABC     65.000000
Name: Market_Cap, dtype: float64

In [9]: d = c.groupby(level=0).transform('sum')

In [10]: d
Out[10]: 
Sector    Feed
Mining    DA      145.000000
          KK      145.000000
Property  A        98.333333
          ABC      98.333333
dtype: float64

In [11]: df['Mean'] = df.apply(lambda x: c[x.Sector, x.Feed], axis=1)

In [12]: df['Weighted_Avg'] = df.apply(lambda x: c[x.Sector, x.Feed] / d[x.Sector, x.Feed], axis=1)

In [13]: df
Out[13]: 
   Close Feed  Market_Cap    Sector       Mean  Weighted_Avg
0   5.60    A          50  Property  33.333333      0.338983
1   5.60    A          20  Property  33.333333      0.338983
2   5.60    A          30  Property  33.333333      0.338983
3   0.67  ABC          50  Property  65.000000      0.661017
4   0.66  ABC          80  Property  65.000000      0.661017
5   0.67   DA          65    Mining  65.000000      0.448276
6   1.66   KK          80    Mining  80.000000      0.551724
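The row-wise apply calls above can be slow on large frames. One possible alternative, sketched here as an assumption rather than as part of the original answer, is to attach the grouped results with a single join on the (Sector, Feed) keys. The names weights and out are made up for the example.

```python
import pandas as pd

# Same data as the session above.
df = pd.DataFrame(
    {
        "Close": [5.60, 5.60, 5.60, 0.67, 0.66, 0.67, 1.66],
        "Feed": ["A", "A", "A", "ABC", "ABC", "DA", "KK"],
        "Market_Cap": [50, 20, 30, 50, 80, 65, 80],
        "Sector": ["Property"] * 5 + ["Mining"] * 2,
    }
)

# Mean per (Sector, Feed), as in In [7].
c = df.groupby(["Sector", "Feed"])["Market_Cap"].mean().rename("Mean")

# Per-sector sum of those means, as in In [9].
d = c.groupby(level=0).transform("sum")

# Share of each feed's mean within its sector.
weights = (c / d).rename("Weighted_Avg")

# Left-join both results onto the rows via the (Sector, Feed) columns.
out = df.join(pd.concat([c, weights], axis=1), on=["Sector", "Feed"])
print(out)
```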
