How to sum specific rows of pandas columns

I have the following data:

    W   X   Y   Z   Pnl
A   1   0   0   0   25    
B   1   1   0   0   34    
C   1   0   0   0   -15    
D   0   0   0   1   2    
E   0   1   0   0   88    
F   1   0   0   0   -46

I would like the following output:

W   -2  # =25+34-15-46
X   122    
Y   0    
Z   2

Use DataFrame.pop to extract the Pnl column, multiply all remaining columns by it with DataFrame.mul (Pnl has already been removed from df by pop), and finally sum each column with DataFrame.sum:

result = df.mul(df.pop('Pnl'), axis=0).sum()
print(result)
W     -2
X    122
Y      0
Z      2
dtype: int64
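To make the one-liner above reproducible, here is a minimal sketch that rebuilds the sample data from the question and runs it. The variable name `totals` is only illustrative, and the row labels A to F are assumed to be the index.

```python
import pandas as pd

# Sample data from the question; rows A..F are assumed to be the index.
df = pd.DataFrame(
    {
        "W": [1, 1, 1, 0, 0, 1],
        "X": [0, 1, 0, 0, 1, 0],
        "Y": [0, 0, 0, 0, 0, 0],
        "Z": [0, 0, 0, 1, 0, 0],
        "Pnl": [25, 34, -15, 2, 88, -46],
    },
    index=list("ABCDEF"),
)

# pop removes Pnl from df and returns it; mul then scales each row of the
# remaining flag columns by that row's Pnl, and sum adds up each column.
totals = df.mul(df.pop("Pnl"), axis=0).sum()
print(totals)
```

Note that after this runs, df only contains the columns W, X, Y and Z.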

Solve this by performing broadcasted multiplication on the first 4 columns, then summing down the rows of each column:

df.iloc[:,:-1].mul(df['Pnl'], axis=0).sum()

W     -2
X    122
Y      0
Z      2
dtype: int64

where the intermediate product is:

df.iloc[:,:-1].mul(df['Pnl'], axis=0)

    W   X  Y  Z
A  25   0  0  0
B  34  34  0  0
C -15   0  0  0
D   0   0  0  2
E   0  88  0  0
F -46   0  0  0
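The multiplication works because the 0/1 flags act as a row mask: a 1 keeps that row's Pnl and a 0 discards it. As a rough illustration (assuming the flag columns only ever hold 0 or 1), the sketch below compares the broadcasted product with an explicit boolean selection per column. The names `flag_cols`, `masked` and `broadcast` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "W": [1, 1, 1, 0, 0, 1],
        "X": [0, 1, 0, 0, 1, 0],
        "Y": [0, 0, 0, 0, 0, 0],
        "Z": [0, 0, 0, 1, 0, 0],
        "Pnl": [25, 34, -15, 2, 88, -46],
    },
    index=list("ABCDEF"),
)
flag_cols = ["W", "X", "Y", "Z"]

# Explicit version: for each flag column, pick the rows where the flag is 1
# and sum their Pnl values.
masked = pd.Series({col: df.loc[df[col] == 1, "Pnl"].sum() for col in flag_cols})

# Broadcasted version: multiply every flag column by Pnl row-wise, then sum.
broadcast = df[flag_cols].mul(df["Pnl"], axis=0).sum()

print(masked.equals(broadcast))
```

The explicit loop is easier to read but slower on wide frames; the broadcasted form does the same work in one vectorised step.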

You can also use df.mul(df.pop('Pnl'), axis=0).sum(), but beware that pop destructively modifies df; avoid it if you need to preserve the input.
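If the input has to stay intact, one option is to select the flag columns by name instead of popping. The sketch below uses `DataFrame.drop(columns=...)`, which returns a new frame and leaves df untouched; it also avoids relying on Pnl being the last column, as `iloc[:, :-1]` does. The name `totals` is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "W": [1, 1, 1, 0, 0, 1],
        "X": [0, 1, 0, 0, 1, 0],
        "Y": [0, 0, 0, 0, 0, 0],
        "Z": [0, 0, 0, 1, 0, 0],
        "Pnl": [25, 34, -15, 2, 88, -46],
    },
    index=list("ABCDEF"),
)

# drop returns a copy without Pnl, so df itself is not modified.
totals = df.drop(columns="Pnl").mul(df["Pnl"], axis=0).sum()

print(totals)
print("Pnl" in df.columns)  # the original column is still present
```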


If performance is important, use numpy:

# pandas < 0.24 (run only one of the two variants: pop removes Pnl from df)
(df.pop('Pnl').values[:, None] * df.values).sum(axis=0)
# pandas 0.24 onwards
(df.pop('Pnl').to_numpy()[:, None] * df.to_numpy()).sum(axis=0)
# array([ -2, 122,   0,   2])

pd.Series((df.pop('Pnl').to_numpy()[:,None] * df.to_numpy()).sum(axis=0),
          index=df.columns)

W     -2
X    122
Y      0
Z      2
dtype: int64
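Multiplying by a column vector and then summing over the rows is the same arithmetic as a vector-matrix product, so the numpy version can also be written with the `@` operator. This is a swapped-in variant rather than something from the answers above, and it does not pop the column, so df is preserved. The names `pnl`, `flags`, `broadcast_sum` and `dot` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "W": [1, 1, 1, 0, 0, 1],
        "X": [0, 1, 0, 0, 1, 0],
        "Y": [0, 0, 0, 0, 0, 0],
        "Z": [0, 0, 0, 1, 0, 0],
        "Pnl": [25, 34, -15, 2, 88, -46],
    },
    index=list("ABCDEF"),
)
flag_cols = ["W", "X", "Y", "Z"]

pnl = df["Pnl"].to_numpy()        # shape (6,)
flags = df[flag_cols].to_numpy()  # shape (6, 4)

# Broadcasting: (6, 1) * (6, 4) -> (6, 4), then sum over the rows.
broadcast_sum = (pnl[:, None] * flags).sum(axis=0)

# Same result as a vector-matrix product: (6,) @ (6, 4) -> (4,).
dot = pnl @ flags

result = pd.Series(dot, index=flag_cols)
print(np.array_equal(broadcast_sum, dot))
print(result)
```

Whether the dot product is actually faster depends on the data size and the numpy build, so it is worth timing on real data before relying on it.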
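Alternatively, split the Pnl column off first, then multiply and sum:

# keep Pnl as a one-column DataFrame so its values have shape (n, 1)
pnl = df[['Pnl']]
# remove Pnl from df in place, leaving only the flag columns
df.drop(['Pnl'], axis=1, inplace=True)
# (n, 4) * (n, 1) broadcasts Pnl across every flag column
res = pd.DataFrame(df.values * pnl.values, columns=df.columns)
final_res = res.sum(axis=0)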

Output:

W     -2
X    122
Y      0
Z      2
