Pandas dataframe: group by two columns and sum one column

Question

I have a pandas dataframe in the following format:

import pandas as pd

d = {'buyer_code': ['A', 'B', 'C', 'A', 'A', 'B', 'B', 'A', 'C'], 'dollar_amount': ['2240.000', '160.000', '300.000', '10920.000', '10920.000', '235.749', '275.000', '10920.000', '300.000']}
df = pd.DataFrame(data=d)
df

This is what my dataframe looks like:

    buyer_code  dollar_amount
0   A           2240.000
1   B           160.000
2   C           300.000
3   A           10920.000
4   A           10920.000
5   B           235.749
6   B           275.000
7   A           10920.000
8   C           300.000

I have used groupby to list each buyer and their corresponding dollar amounts.

df.groupby(['buyer_code', 'dollar_amount']).size()

This is the result:

buyer_code  dollar_amount
A           10920.000        3
            2240.000         1
B           160.000          1
            235.749          1
            275.000          1
C           300.000          2
dtype: int64

Now I want each dollar_amount multiplied by its count, and then the sum of those products for each buyer.

For example, buyer_code "A" should come to (10920.000 * 3) + (2240.000 * 1) = 35000.000.

The result should be something like this:

buyer_code  dollar_amount
A           35000
B           670.749
C           600.000

How can I get this output?
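A literal reading of the request (count each amount, multiply, then total per buyer) can be sketched as follows. The column names `n` and `weighted` are introduced here only for illustration and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({
    "buyer_code": ["A", "B", "C", "A", "A", "B", "B", "A", "C"],
    "dollar_amount": ["2240.000", "160.000", "300.000", "10920.000", "10920.000",
                      "235.749", "275.000", "10920.000", "300.000"],
})

# Count how often each (buyer, amount) pair occurs, as a flat table.
# "n" is a hypothetical name for the count column.
counts = df.groupby(["buyer_code", "dollar_amount"]).size().reset_index(name="n")

# Weight each distinct amount by its count, then total per buyer.
counts["weighted"] = counts["dollar_amount"].astype(float) * counts["n"]
totals = counts.groupby("buyer_code")["weighted"].sum()
print(totals)
```

Since multiplying an amount by its count is the same as adding that amount once per row, this should give the same totals as summing the original rows directly.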

Answer 1

Use groupby + aggregate sum:

# dollar_amount holds strings, so convert to float before summing
df['dollar_amount'] = df['dollar_amount'].astype(float)
a = df.groupby('buyer_code', as_index=False).sum()
print(a)
  buyer_code  dollar_amount
0          A      35000.000
1          B        670.749
2          C        600.000
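The conversion step matters because the amounts in the question are strings, not numbers. As a hedged variation (not part of the original answer), `pd.to_numeric` with `errors="coerce"` can stand in for `astype(float)` when some entries might not parse, and selecting the column explicitly keeps the sum limited to `dollar_amount`. The name `was_numeric` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "buyer_code": ["A", "B", "C", "A", "A", "B", "B", "A", "C"],
    "dollar_amount": ["2240.000", "160.000", "300.000", "10920.000", "10920.000",
                      "235.749", "275.000", "10920.000", "300.000"],
})

# The amounts were created as strings, so the column is not numeric yet.
was_numeric = pd.api.types.is_numeric_dtype(df["dollar_amount"])

# errors="coerce" turns unparseable entries into NaN instead of raising.
df["dollar_amount"] = pd.to_numeric(df["dollar_amount"], errors="coerce")

# Sum only the amount column, keeping buyer_code as a regular column.
totals = df.groupby("buyer_code", as_index=False)["dollar_amount"].sum()
print(totals)
```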

Answer 2

Unstack your result, and then use dot to perform a matrix multiplication between the result and its columns:

i = df.groupby(['buyer_code', 'dollar_amount']).size().unstack()
i.fillna(0).dot(i.columns.astype(float))

buyer_code
A    35000.000
B      670.749
C      600.000
dtype: float64

Or, to get a dataframe back:

i.fillna(0).dot(i.columns.astype(float))\
         .reset_index(name='dollar_amount')

  buyer_code  dollar_amount
0          A      35000.000
1          B        670.749
2          C        600.000

This is fine if you are doing something else with the intermediate groupby result, so that it has to be computed anyway. If not, a groupby + sum makes more sense here.
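To check that the two routes agree on the sample data, a minimal sketch follows. The variable names `by_sum`, `by_dot`, and `agree` are illustrative, and `unstack(fill_value=0)` is used here in place of the separate `fillna(0)` call.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "buyer_code": ["A", "B", "C", "A", "A", "B", "B", "A", "C"],
    "dollar_amount": ["2240.000", "160.000", "300.000", "10920.000", "10920.000",
                      "235.749", "275.000", "10920.000", "300.000"],
})
df["dollar_amount"] = df["dollar_amount"].astype(float)

# Route 1: plain groupby + sum.
by_sum = df.groupby("buyer_code")["dollar_amount"].sum()

# Route 2: counts table (buyers x distinct amounts) times the amounts.
counts = df.groupby(["buyer_code", "dollar_amount"]).size().unstack(fill_value=0)
by_dot = counts.dot(counts.columns.to_numpy(dtype=float))

# Both results are indexed by buyer_code in sorted order.
agree = np.allclose(by_sum.to_numpy(), by_dot.to_numpy())
print(agree)
```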

2 answers

Answer 1 (score 3, accepted, 2017-12-29 05:56:39)

Answer 2 (score 2, 2017-12-29 05:55:59)