简体   繁体   中英

how to calculate weighted average or sum by groupby from a list?

I have a data frame and a list of weight as follows

import pandas as pd
import numpy as np

data = [
  ['A',1,2,3,4],
  ['A',5,6,7,8],
  ['A',9,10,11,12],
  ['B',13,14,15,16],
  ['B',17,18,19,20],
  ['B',21,22,23,24],
  ['B',25,26,27,28],
  ['C',29,30,31,32],
  ['C',33,34,35,36],
  ['C',37,38,39,40],

]

df = pd.DataFrame(data, columns=['Name', 'num1', 'num2', 'num3', 'num4'])
df

Now I want to calculate weighted average with the help of list

weights = [10,20,30,40] = 100


for cols in df.columns:
    df[cols]=df.groupby(['dpid'])[cols].transform(lambda x:)

But I am not sure how to calculate the weighted average from each Name.

For A it has only three rows so, it should be calculated for num1 as (1*30+5*30+9*40)/100 if I am not wrong. and same for columns num2,num3, and num4.

For B it has 4 rows, so, it should be calculated as (13*10+17*20+21*30+25*40)/100 .

Can somebody help me with this?

IIUC, you can use a groupby.apply :

df.groupby('Name').agg(lambda g: sum(g*weights[:len(g)])/sum(weights[:len(g)]))

output:

           num1       num2       num3       num4
Name                                            
A      6.333333   7.333333   8.333333   9.333333
B     21.000000  22.000000  23.000000  24.000000
C     34.333333  35.333333  36.333333  37.333333

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM