
Solving multiple linear equations using Pandas

I have what I think is a very interesting problem here, but I have little idea how to go about solving it computationally, or whether a pandas DataFrame is appropriate for this purpose. I have data like so:

    SuperGroup   Group  Code  Weight Income
8   E1           E012   a     0.5    1000
9   E1           E012   b     0.2    1000
10  E1           E013   b     0.2    1000
11  E1           E013   c     0.3    1000

Effectively, 'Code' has a one-to-one relationship with 'Weight'.

'SuperGroup' has a one-to-one relationship with 'Income'.

A SuperGroup is composed of many Groups and a Group has many Codes.

I am attempting to distribute the income according to the combined weights of the codes within each group. For E012 this is 0.5 * 0.2 = 0.1, and for E013 it is 0.2 * 0.3 = 0.06. As proportions of their total, E012 becomes 0.625 (0.1/(0.1+0.06)) and E013 becomes 0.375 (0.06/(0.1+0.06)).

The dataframe can be collapsed and re-written as:

    SuperGroup   Group  Code  CombinedWeight Income
8   E1           E012   a,b   0.625          1000
10  E1           E013   b,c   0.375          1000
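For readers who want to reproduce the collapsed table, here is a minimal sketch of one way to build it. It assumes the sample rows shown earlier; the variable names `collapsed` and `totals` are only illustrative, and this is not necessarily how the asker produced it.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame(
    {
        "SuperGroup": ["E1", "E1", "E1", "E1"],
        "Group": ["E012", "E012", "E013", "E013"],
        "Code": ["a", "b", "b", "c"],
        "Weight": [0.5, 0.2, 0.2, 0.3],
        "Income": [1000, 1000, 1000, 1000],
    },
    index=[8, 9, 10, 11],
)

# Collapse each (SuperGroup, Group) to one row: join the codes,
# multiply the code weights, and keep the (constant) income.
collapsed = df.groupby(["SuperGroup", "Group"], as_index=False).agg(
    Code=("Code", ",".join),
    CombinedWeight=("Weight", "prod"),
    Income=("Income", "first"),
)

# Normalise the products so they sum to 1 within each SuperGroup.
totals = collapsed.groupby("SuperGroup")["CombinedWeight"].transform("sum")
collapsed["CombinedWeight"] = collapsed["CombinedWeight"] / totals

print(collapsed)
```

Note that this sketch resets the row index, so the original labels 8 and 10 are not preserved.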

I am capable of producing the above dataframe, but my next step is to apply the weights to the income so that it still averages to 1000 within the supergroup while each group's share reflects the size of its weight.

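Let x and y be the incomes assigned to E012 and E013. The incomes should be in the same ratio as the weights, so x/y = 0.625/0.375, i.e. x ≈ 1.67y.

Additionally, (x+y)/2 = 1000. Note: my data often has several groups in a supergroup, so there could be more than two unknowns, resulting in a system of linear equations if my understanding is correct.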
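The system described here can be written down and solved directly for any number of groups. The following is only a sketch using NumPy rather than pandas; the helper name `weighted_incomes` is made up for illustration, and it assumes all weights are strictly positive (a zero last weight would make the matrix singular).

```python
import numpy as np


def weighted_incomes(weights, mean_income):
    """Solve for incomes proportional to `weights` whose mean is `mean_income`.

    Hypothetical helper; assumes every weight is > 0.
    """
    w = np.asarray(weights, dtype=float)
    n = len(w)
    A = np.zeros((n, n))
    b = np.zeros(n)
    # Rows 0..n-2: w_last * x_i - w_i * x_last = 0, i.e. x_i / x_last = w_i / w_last.
    for i in range(n - 1):
        A[i, i] = w[-1]
        A[i, -1] = -w[i]
    # Last row: (x_0 + ... + x_{n-1}) / n = mean_income.
    A[-1, :] = 1.0 / n
    b[-1] = mean_income
    return np.linalg.solve(A, b)


incomes = weighted_incomes([0.625, 0.375], 1000)
print(incomes)  # approximately [1250. 750.]
```

Because every equation but the last only fixes ratios, the solution also has the closed form x_i = n * mean * w_i / sum(w), so an explicit solver is not strictly needed.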

Solving simultaneously produces 1250 and 750 as the weighted incomes. The dataframe can be re-written as:

    SuperGroup   Group  Code  Income
8   E1           E012   a,b   1250
10  E1           E013   b,c   750

which is effectively how I need it. Any guidance is warmly appreciated.

First, aggregate the DataFrame by ['SuperGroup', 'Group']:

# Multiply the weights in each group, join its codes, keep the shared income
res = (df.groupby(['SuperGroup', 'Group'])
         .agg({'Weight': 'prod',
               'Code': ','.join,
               'Income': 'first'}))

Then rescale the Income within each SuperGroup with the help of transform:

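# Total income per SuperGroup (n groups * 1000), split in proportion to each
# group's share of the summed weights
s = res.groupby(level='SuperGroup')
res['Income'] = s.Income.transform('sum')*res.Weight/s.Weight.transform('sum')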

                  Weight Code  Income
SuperGroup Group                     
E1         E012     0.10  a,b  1250.0
           E013     0.06  b,c   750.0
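To check that this generalises beyond two groups, here is a self-contained sketch that runs the same two steps on the question's rows plus a second supergroup. The E2 rows (groups G1 to G3, codes d and e, income 600) are invented purely for illustration and do not come from the question.

```python
import pandas as pd

# E1 rows are from the question; E2 rows are hypothetical test data.
df = pd.DataFrame({
    "SuperGroup": ["E1"] * 4 + ["E2"] * 5,
    "Group": ["E012", "E012", "E013", "E013", "G1", "G1", "G2", "G3", "G3"],
    "Code": ["a", "b", "b", "c", "a", "d", "c", "a", "e"],
    "Weight": [0.5, 0.2, 0.2, 0.3, 0.5, 0.4, 0.3, 0.5, 1.0],
    "Income": [1000] * 4 + [600] * 5,
})

# Step 1: one row per (SuperGroup, Group) with the product of its weights.
res = df.groupby(["SuperGroup", "Group"]).agg(
    {"Weight": "prod", "Code": ",".join, "Income": "first"}
)

# Step 2: split each SuperGroup's total income in proportion to the weights.
s = res.groupby(level="SuperGroup")
res["Income"] = (
    s["Income"].transform("sum") * res["Weight"] / s["Weight"].transform("sum")
)

# The mean income per SuperGroup should be unchanged (1000 and 600).
means = res.groupby(level="SuperGroup")["Income"].mean()
print(res)
print(means)
```

This works because the income is repeated on every group row, so its sum within a supergroup equals the number of groups times the original income, which is exactly the total that the averaging constraint requires.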
