Groupby and sum over some columns in pandas while keeping other columns

Question

I have the following data:

import pandas as pd

x4 = pd.DataFrame({"ID": [101, 101, 102, 103, 104, 105],
                   "Prob": [1, 1, 1, 1, 1, 1],
                   "Ef": [0, 2, 0, 0, 0.25, 0.29],
                   "W": [2, 2, 3, 4, 5, 6],
                   "EC": [0, 0, 0, 0, 1.6, 2],
                   "Rand": [11, 12, 12, 13, 14, 15]})

I would like to get sum(Prob * Ef) by ID and then keep only the ID column, the column with the sum, the EC column and the W column.

So in the end I want to have this:

     ID  sum_column   EC  W
1:  101        2.00  0.0  2
2:  101        2.00  0.0  2
3:  102        0.00  0.0  3
4:  103        0.00  0.0  4
5:  104        0.25  1.6  5
6:  105        0.29  2.0  6

I have tried this:

x4.loc[:, ['EC','W','ID','Prob','Ef']].groupby('ID').sum(Prob*Ef)

But it does not work.

Answer 1

Multiply the two columns, then use GroupBy.transform with 'sum' so that each group's total is written back to every row of that group:

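# Row-wise product Prob * Ef, summed per ID and broadcast back to each row
x4['sum_column'] = x4['Prob'].mul(x4['Ef']).groupby(x4['ID']).transform('sum')
# Drop the columns that are no longer needed
x4 = x4.drop(['Ef', 'Prob', 'Rand'], axis=1)
print(x4)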
    ID  W   EC  sum_column
0  101  2  0.0        2.00
1  101  2  0.0        2.00
2  102  3  0.0        0.00
3  103  4  0.0        0.00
4  104  5  1.6        0.25
5  105  6  2.0        0.29
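
To see why transform is used here rather than a plain aggregation, here is a minimal sketch on a three-row subset of the data. The names df, prod, collapsed and broadcast are illustrative only and do not come from the question or the answer.

```python
import pandas as pd

# Small illustrative frame (subset of the question's data)
df = pd.DataFrame({"ID": [101, 101, 102],
                   "Prob": [1, 1, 1],
                   "Ef": [0, 2, 0]})

# Row-wise product of the two columns
prod = df["Prob"] * df["Ef"]

# Plain aggregation: one value per group, indexed by ID
collapsed = prod.groupby(df["ID"]).sum()

# transform: result keeps the length and index of the original frame
broadcast = prod.groupby(df["ID"]).transform("sum")

print(len(collapsed), len(broadcast))
```

Because the transformed result is aligned with the original index, it can be assigned directly as a new column, which is what lets the other columns (W, EC) stay in place.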

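If the order of the columns matters, use DataFrame.insert instead of plain assignment. This is an alternative to the snippet above, so run it on the original x4 (the Prob and Ef columns must still be present):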

x4.insert(1, 'sum_column', x4['Prob'].mul(x4['Ef']).groupby(x4['ID']).transform('sum'))
x4 = x4.drop(['Ef', 'Prob', 'Rand'], axis=1)
print(x4)
    ID  sum_column  W   EC
0  101        2.00  2  0.0
1  101        2.00  2  0.0
2  102        0.00  3  0.0
3  103        0.00  4  0.0
4  104        0.25  5  1.6
5  105        0.29  6  2.0
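
As a possible variation (not part of the original answer), the same result can be sketched without modifying x4 in place, by combining DataFrame.assign with an explicit column selection. The name result is an assumption made for this example; the column order follows the desired output in the question.

```python
import pandas as pd

x4 = pd.DataFrame({"ID": [101, 101, 102, 103, 104, 105],
                   "Prob": [1, 1, 1, 1, 1, 1],
                   "Ef": [0, 2, 0, 0, 0.25, 0.29],
                   "W": [2, 2, 3, 4, 5, 6],
                   "EC": [0, 0, 0, 0, 1.6, 2],
                   "Rand": [11, 12, 12, 13, 14, 15]})

# Per-ID sum of Prob * Ef, aligned with the original rows
group_sum = (x4["Prob"] * x4["Ef"]).groupby(x4["ID"]).transform("sum")

# assign returns a new frame; the column list both selects and orders
result = x4.assign(sum_column=group_sum)[["ID", "sum_column", "EC", "W"]]

print(result)
```

Selecting the wanted columns by name avoids having to list every column to drop, and x4 itself is left unchanged.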

Answer 1: 2 votes, accepted, 2017-10-04 09:13:12
