
How to sum specific rows of pandas columns

I have the following data:

    W   X   Y   Z   Pnl
A   1   0   0   0   25    
B   1   1   0   0   34    
C   1   0   0   0   -15    
D   0   0   0   1   2    
E   0   1   0   0   88    
F   1   0   0   0   -46

I would like the following output:

W   -2  # =25+34-15-46
X   122    
Y   0    
Z   2

Use DataFrame.pop to extract the Pnl column (pop also removes it from df), multiply the remaining columns by it row-wise with DataFrame.mul and axis=0, then total each column with DataFrame.sum:

df = df.mul(df.pop('Pnl'), axis=0).sum()
print(df)
W     -2
X    122
Y      0
Z      2
dtype: int64
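
For a version that can be run as-is, the sketch below rebuilds the question's table and splits the one-liner into two steps so the order of operations is visible. The constructor call is an assumption about how the data is stored (the post only shows the printed frame), and the names pnl and totals are illustrative.

```python
import pandas as pd

# Rebuild the table from the question; A-F are the row labels.
df = pd.DataFrame(
    {
        "W": [1, 1, 1, 0, 0, 1],
        "X": [0, 1, 0, 0, 1, 0],
        "Y": [0, 0, 0, 0, 0, 0],
        "Z": [0, 0, 0, 1, 0, 0],
        "Pnl": [25, 34, -15, 2, 88, -46],
    },
    index=list("ABCDEF"),
)

# Step 1: pop returns the Pnl column and removes it from df in place.
pnl = df.pop("Pnl")

# Step 2: axis=0 aligns pnl with the row index, so every row of the
# remaining indicator columns is scaled by that row's Pnl.
# sum() then totals each column.
totals = df.mul(pnl, axis=0).sum()

print(totals.tolist())  # [-2, 122, 0, 2]
```

After this runs, df holds only W, X, Y and Z, because pop has already removed Pnl.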

Multiply the first four columns by Pnl, broadcasting along the rows (axis=0), then sum each column:

df.iloc[:,:-1].mul(df['Pnl'], axis=0).sum()

W     -2
X    122
Y      0
Z      2
dtype: int64

where the element-wise product, before summing, is:

df.iloc[:,:-1].mul(df['Pnl'], axis=0)

    W   X  Y  Z
A  25   0  0  0
B  34  34  0  0
C -15   0  0  0
D   0   0  0  2
E   0  88  0  0
F -46   0  0  0
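
Summing this intermediate table down each column is the same arithmetic as a vector-matrix product of Pnl with the indicator columns. As an aside that is not part of the answer above, Series.dot expresses that directly. The frame construction is assumed from the question's table, and flags, via_mul and via_dot are illustrative names.

```python
import pandas as pd

# Rebuild the table from the question; A-F are the row labels.
df = pd.DataFrame(
    {
        "W": [1, 1, 1, 0, 0, 1],
        "X": [0, 1, 0, 0, 1, 0],
        "Y": [0, 0, 0, 0, 0, 0],
        "Z": [0, 0, 0, 1, 0, 0],
        "Pnl": [25, 34, -15, 2, 88, -46],
    },
    index=list("ABCDEF"),
)

flags = df.iloc[:, :-1]                   # the W, X, Y, Z columns
product = flags.mul(df["Pnl"], axis=0)    # the intermediate table
via_mul = product.sum()                   # column totals

# Series.dot with a DataFrame aligns on the row index and returns
# one value per column of the DataFrame.
via_dot = df["Pnl"].dot(flags)

print(product.loc["B"].tolist())  # [34, 34, 0, 0]
print(via_dot.tolist())           # [-2, 122, 0, 2]
```

Neither form modifies df, so Pnl is still available afterwards.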

You can also use df.mul(df.pop('Pnl'), axis=0).sum(), but beware that pop modifies df in place by removing the Pnl column. Avoid it if you need to preserve the input.
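
One way to keep the pop one-liner without losing the input is to run it on a copy. This is a minimal sketch, assuming the frame from the question; work is a hypothetical name for the scratch copy.

```python
import pandas as pd

# Rebuild the table from the question; A-F are the row labels.
df = pd.DataFrame(
    {
        "W": [1, 1, 1, 0, 0, 1],
        "X": [0, 1, 0, 0, 1, 0],
        "Y": [0, 0, 0, 0, 0, 0],
        "Z": [0, 0, 0, 1, 0, 0],
        "Pnl": [25, 34, -15, 2, 88, -46],
    },
    index=list("ABCDEF"),
)

# pop mutates the frame it is called on, so call it on a copy.
work = df.copy()
totals = work.mul(work.pop("Pnl"), axis=0).sum()

print(list(df.columns))    # ['W', 'X', 'Y', 'Z', 'Pnl']
print(list(work.columns))  # ['W', 'X', 'Y', 'Z']
print(totals.tolist())     # [-2, 122, 0, 2]
```

The copy costs extra memory, so for large frames the iloc form, which never touches df, may be the better fit.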


If performance is important, use NumPy:

# Each expression below pops 'Pnl', so run only one of them on a fresh df.
# pandas < 0.24
(df.pop('Pnl').values[:, None] * df.values).sum(axis=0)
# pandas 0.24 onwards
(df.pop('Pnl').to_numpy()[:, None] * df.to_numpy()).sum(axis=0)
# array([ -2, 122,   0,   2])

# To get a labelled Series back (again on a fresh df):
pd.Series((df.pop('Pnl').to_numpy()[:, None] * df.to_numpy()).sum(axis=0),
          index=df.columns)

W     -2
X    122
Y      0
Z      2
dtype: int64
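
The same NumPy broadcast can be written without pop, which leaves df untouched. This is a sketch under the assumption of pandas 0.24 or later (for to_numpy) and the frame from the question; cols, flags and result are illustrative names.

```python
import pandas as pd

# Rebuild the table from the question; A-F are the row labels.
df = pd.DataFrame(
    {
        "W": [1, 1, 1, 0, 0, 1],
        "X": [0, 1, 0, 0, 1, 0],
        "Y": [0, 0, 0, 0, 0, 0],
        "Z": [0, 0, 0, 1, 0, 0],
        "Pnl": [25, 34, -15, 2, 88, -46],
    },
    index=list("ABCDEF"),
)

cols = ["W", "X", "Y", "Z"]
flags = df[cols].to_numpy()   # shape (6, 4)
pnl = df["Pnl"].to_numpy()    # shape (6,)

# pnl[:, None] has shape (6, 1), so it broadcasts across the four
# columns; sum(axis=0) then collapses the six rows.
totals = (pnl[:, None] * flags).sum(axis=0)

# Re-attach the column labels that the raw array lost.
result = pd.Series(totals, index=cols)
print(result.tolist())  # [-2, 122, 0, 2]
```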
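
Alternatively, keep Pnl as a one-column DataFrame, drop it from df (this also modifies df in place), and multiply the underlying arrays:

pnl = df[['Pnl']]
df.drop(['Pnl'], axis=1, inplace=True)
res = pd.DataFrame(df.values * pnl.values, columns=df.columns)
final_res = res.sum(axis=0)

Output: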

W     -2
X    122
Y      0
Z      2
