Pandas GroupBy: comma separated list of sums

I have the groupby below, which sums the Amounts at the "ParentAccount" level. I am trying to show the details behind that sum on the same line. I already have a comma-separated list of accounts next to the total, but I would also like to add a single column that shows a comma-separated list of the sums at the Account level.

So for the code below, I would have the following strings of floats in a separate column:

ParentAccount 1: 3.75, 1
ParentAccount 2: 14, 10.5

I am not sure of the best way to go about this. I tried merging two separate groupbys, but I think there is probably a better way.

import pandas as pd

data = {
        'ParentAccount': [1,1,1,2,2,2],
        'Account': ['A', 'A', 'C', 'D', 'D','E'],
        'Amount':  [1.5, 2.25, 1, 4.75, 9.25, 10.50],
        }

df = pd.DataFrame(data)
df_final = df.groupby('ParentAccount').agg({'Amount': 'sum', 'Account': lambda x: ','.join(x.unique()),}).add_suffix('-Net')

print(df_final)
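
For reference, the merge of two separate groupbys mentioned above might look roughly like the sketch below. The question does not show that attempt, so this is an assumption about its shape; the names totals, per_account, detail and merged are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ParentAccount': [1, 1, 1, 2, 2, 2],
    'Account': ['A', 'A', 'C', 'D', 'D', 'E'],
    'Amount': [1.5, 2.25, 1, 4.75, 9.25, 10.50],
})

# First groupby: total and unique accounts per ParentAccount (same as df_final).
totals = df.groupby('ParentAccount').agg(
    {'Amount': 'sum', 'Account': lambda x: ','.join(x.unique())}
).add_suffix('-Net')

# Second groupby: sum per (ParentAccount, Account), then join those sums
# as strings for each ParentAccount.
per_account = df.groupby(['ParentAccount', 'Account'])['Amount'].sum()
detail = (per_account.astype(str)
          .groupby(level='ParentAccount')
          .agg(', '.join)
          .rename('Amounts per Account'))  # hypothetical column name

# Both results are indexed by ParentAccount, so merge on the index.
merged = totals.merge(detail, left_index=True, right_index=True)
print(merged)
```

This produces the extra column, but it needs two intermediate results and a merge, which is the part that feels heavier than necessary.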

You could groupby "ParentAccount" and "Account" to find the sum per account, then groupby "ParentAccount" again and pass an unpacked dictionary to agg to do the three things you want: (i) sum the amounts, (ii) join the unique accounts for each ParentAccount, and (iii) join the amounts per account for each ParentAccount. The dictionary is unpacked with ** because output names such as 'Amount-Net' and 'Amounts per Account' are not valid Python keyword names:

out = (df
       .groupby(['ParentAccount','Account'])
       .sum()
       .reset_index(level=1)
       .groupby(level=0)
       .agg(**{'Amount-Net': ('Amount','sum'), 
               'Account-Net': ('Account', lambda x: ', '.join(x)) , 
               'Amounts per Account': ('Amount', lambda x: ', '.join(x.astype(str)))}))

Output:

               Amount-Net Account-Net Amounts per Account
ParentAccount                                            
1                    4.75        A, C           3.75, 1.0
2                   24.50        D, E          14.0, 10.5    

Use a double groupby:

out = (
    df.groupby(['ParentAccount', 'Account'], as_index=False)['Amount'].sum()
      .groupby('ParentAccount', as_index=False)
      .agg(**{'Amount-Net': ('Amount', 'sum'),
              'Amount-Detail': ('Amount', lambda x: ','.join(x.astype(str))), 
              'Account-Net': ('Account', ','.join)})
)

Output:

>>> out
   ParentAccount  Amount-Net Amount-Detail Account-Net
0              1        4.75      3.75,1.0         A,C
1              2       24.50     14.0,10.5         D,E
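
Note that astype(str) renders whole-number sums as 1.0 and 14.0, whereas the question asks for 1 and 14. One possible way to match that, sketched here as an assumption rather than as part of either answer, is to format each value with Python's 'g' format spec, which drops trailing zeros. The helper name join_amounts is hypothetical. Be aware that 'g' keeps only six significant digits by default, so larger amounts would need a wider spec such as '.12g'.

```python
import pandas as pd

df = pd.DataFrame({
    'ParentAccount': [1, 1, 1, 2, 2, 2],
    'Account': ['A', 'A', 'C', 'D', 'D', 'E'],
    'Amount': [1.5, 2.25, 1, 4.75, 9.25, 10.50],
})

def join_amounts(amounts):
    # Hypothetical helper: 'g' drops trailing zeros (1.0 -> '1', 14.0 -> '14').
    # It keeps six significant digits by default; widen it for large amounts.
    return ', '.join(format(v, 'g') for v in amounts)

out = (
    df.groupby(['ParentAccount', 'Account'], as_index=False)['Amount'].sum()
      .groupby('ParentAccount')
      .agg(**{'Amount-Net': ('Amount', 'sum'),
              'Account-Net': ('Account', ', '.join),
              'Amounts per Account': ('Amount', join_amounts)})
)
print(out)
```

With the sample data, the 'Amounts per Account' column then reads 3.75, 1 for ParentAccount 1 and 14, 10.5 for ParentAccount 2.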
