Groupby percentage of total for two columns

Question

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Product': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB'],
    'Type': ['AC', 'AC', 'AD', 'AD', 'BC', 'BC', 'BD', 'BD'],
    'Sales': [200, 100, 400, 100, 300, 100, 200, 500],
    'Qty': [5, 3, 3, 6, 4, 7, 4, 1]})

I want to get the percentage of total by "Product" and "Type" for both "Sales" and "Qty". I can get the percentage of total for "Sales" and "Qty" separately, but I was wondering if there is a way to do it for both columns at once.

To get the percentage of total for one column, the code is:

df['Sales'] = df['Sales'].astype(float)
df['Qty'] = df['Qty'].astype(float)
df = df[['Product', 'Type', 'Sales']]

df = df.groupby(['Product', 'Type']).agg({'Sales': 'sum'})
pcts = df.groupby(level= [0]).apply(lambda x: 100 * x / float(x.sum()))

Is there a way to get this for both columns in one go?

Answer 1

You can chain groupby:

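pct = lambda x: 100 * x / x.sum()

# group_keys=False keeps the (Product, Type) index as-is; pandas 2.x would otherwise prepend Product again
out = df.groupby(['Product', 'Type']).sum().groupby('Product', group_keys=False).apply(pct)
print(out)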

# Output
                  Sales        Qty
Product Type                      
AA      AC    37.500000  47.058824
        AD    62.500000  52.941176
BB      BC    36.363636  68.750000
        BD    63.636364  31.250000

Answer 2

You could groupby "Product" and "Type" to get the totals for each group. Then groupby "Product" (which is level=0) again and transform sum to get each product's total, and divide the totals from the previous step by it:

sm = df.groupby(['Product','Type']).sum()
out = sm / sm.groupby(level=0).transform('sum') * 100

Output:

                  Sales        Qty
Product Type                      
AA      AC    37.500000  47.058824
        AD    62.500000  52.941176
BB      BC    36.363636  68.750000
        BD    63.636364  31.250000
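As a sanity check on the transform-based approach, here is a minimal sketch that wraps the two steps in a function and confirms that the shares within each product add up to 100. The helper name pct_of_group_total and its parameters are made up for illustration; they are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB'],
    'Type': ['AC', 'AC', 'AD', 'AD', 'BC', 'BC', 'BD', 'BD'],
    'Sales': [200, 100, 400, 100, 300, 100, 200, 500],
    'Qty': [5, 3, 3, 6, 4, 7, 4, 1]})


def pct_of_group_total(frame, keys, outer):
    """Hypothetical helper: percentage share of each `keys` group within its `outer` group."""
    # Totals per (Product, Type) pair.
    totals = frame.groupby(keys).sum()
    # Total per outer group, broadcast back to the shape of `totals`.
    outer_totals = totals.groupby(level=outer).transform('sum')
    return totals / outer_totals * 100


out = pct_of_group_total(df, ['Product', 'Type'], 'Product')

# Within each product, the percentages of every column should sum to 100.
check = out.groupby(level='Product').sum()
print(out)
print(check)
```

Because transform returns a frame with the same index as its input, the division lines up row by row without any extra alignment arguments.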

Answer 3

One option is to get the values from individual groupbys and divide:

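numerator = df.groupby(["Product", "Type"]).sum()
# select the numeric columns explicitly: pandas 2.x no longer drops the string column "Type" when summing
denominator = df.groupby("Product")[["Sales", "Qty"]].sum()
numerator.div(denominator, level=0, axis='index') * 100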

                  Sales        Qty
Product Type                      
AA      AC    37.500000  47.058824
        AD    62.500000  52.941176
BB      BC    36.363636  68.750000
        BD    63.636364  31.250000
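To see that the three approaches in this thread agree, the following self-contained sketch computes each of them on the question's data and compares the results. The variable names (via_apply, via_transform, via_div) are illustrative only, and the group_keys=False argument and the explicit column selection are assumptions made so the code behaves the same on older and newer pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB'],
    'Type': ['AC', 'AC', 'AD', 'AD', 'BC', 'BC', 'BD', 'BD'],
    'Sales': [200, 100, 400, 100, 300, 100, 200, 500],
    'Qty': [5, 3, 3, 6, 4, 7, 4, 1]})

# Totals per (Product, Type); shared starting point for all three variants.
totals = df.groupby(['Product', 'Type']).sum()

# Chained groupby with apply; group_keys=False leaves the index untouched.
via_apply = totals.groupby('Product', group_keys=False).apply(
    lambda x: 100 * x / x.sum())

# Divide by the per-product total broadcast back with transform.
via_transform = totals / totals.groupby(level=0).transform('sum') * 100

# Divide by a separate per-product frame, aligned on index level 0.
# Only the numeric columns are summed so the string column "Type" stays out.
per_product = df.groupby('Product')[['Sales', 'Qty']].sum()
via_div = totals.div(per_product, level=0, axis='index') * 100

# Largest absolute difference between the variants (expected to be ~0).
print((via_apply - via_transform).abs().max().max())
print((via_transform - via_div).abs().max().max())
```

The transform and div variants avoid calling a Python function per group, which tends to matter more as the number of groups grows.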

Answer 1
5 (accepted) 2022-02-15 20:48:52

Answer 2
3

Answer 3
2 2022-02-15 21:08:38
