Sum of combinations of columns

I have a dataset:

df = pd.DataFrame([
    {'cpu': 1, 'price': 101},
    {'cpu': 1, 'price': 99},
    {'cpu': 4, 'price': 180},
    {'cpu': 8, 'price': 199},
    {'cpu': 9, 'price': 202},
])

   cpu  price
0    1    101
1    1     99
2    4    180
3    8    199
4    9    202

I need to sum cpu and price over every possible combination of rows (expected df):

   cpu_total  total_price  cpu  price
0          2          200    1    101    <- first combination (1 + 1 = 2, 101 + 99 = 200) 
1          2          200    1     99
2          6          380    1    101    <- next combination (1 + 1 + 4 = 6, 101 + 99 + 180 = 380)
3          6          380    1     99
4          6          380    4    180
...  other combinations ...
5         17          401    8    199    <- last combination (8 + 9 = 17, 199 + 202 = 401)
6         17          401    9    202

I have tried using itertools.combinations and itertools.product, but it looks like there should be a simpler solution.

You can use a slightly modified powerset recipe to get every possible combination of 2 to N index labels. For each combination, select its rows from the DataFrame by label and tag them with a unique group id (grp), then concatenate the pieces. Because each combination is labeled uniquely, groupby + transform('sum') then gives the totals for each group.

from itertools import chain, combinations
import pandas as pd

def powerset(iterable):
    "powerset([1,2,3]) --> (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(2, len(s)+1))

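# Every combination of 2..N index labels
ps = powerset(df.index)
# Select each combination's rows and tag them with a unique group id
df1 = pd.concat([df.loc[list(x), :].assign(grp=i) for i, x in enumerate(ps)])

# Broadcast each group's sum back onto every row of that group
for col in ['cpu', 'price']:
    df1[f'{col}_total'] = df1.groupby('grp')[col].transform('sum')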

print(df1)
    cpu  price  grp  cpu_total  price_total
0     1    101    0          2          200
1     1     99    0          2          200
0     1    101    1          5          281
2     4    180    1          5          281
0     1    101    2          9          300
..  ...    ...  ...        ...          ...
0     1    101   25         23          781
1     1     99   25         23          781
2     4    180   25         23          781
3     8    199   25         23          781
4     9    202   25         23          781

[75 rows x 5 columns]
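
If only the totals are needed, a more compact variant is possible. The following is a minimal sketch, not part of the answer above: it keeps one summary row per combination instead of repeating the member rows, and it double-checks the group and row counts of the long output. The helper name combos_of_two_or_more and the column name rows are made up for illustration, and math.comb assumes Python 3.8 or later.

```python
from itertools import chain, combinations
from math import comb

import pandas as pd

df = pd.DataFrame({
    'cpu': [1, 1, 4, 8, 9],
    'price': [101, 99, 180, 199, 202],
})

def combos_of_two_or_more(labels):
    """Return every combination of 2..N labels (hypothetical helper name)."""
    labels = list(labels)
    return chain.from_iterable(
        combinations(labels, r) for r in range(2, len(labels) + 1)
    )

# One summary row per combination: the index labels used and their sums.
summary = pd.DataFrame([
    {
        'rows': combo,
        'cpu_total': int(df.loc[list(combo), 'cpu'].sum()),
        'price_total': int(df.loc[list(combo), 'price'].sum()),
    }
    for combo in combos_of_two_or_more(df.index)
])

n = len(df)
# Number of combinations of size 2..n, i.e. the number of groups.
n_groups = sum(comb(n, r) for r in range(2, n + 1))
# Each combination of size r contributes r rows to the long frame.
n_long_rows = sum(r * comb(n, r) for r in range(2, n + 1))

print(summary.head())
print(n_groups, n_long_rows)  # 26 75
```

The counts agree with the output above: 26 groups (grp 0 through 25) and 75 rows in the long frame. The number of combinations grows roughly as 2**N, so either variant is only practical for a small number of rows.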
