I have a dataframe as follows:
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({'key1' : ['a','a','b','b','a'], 'key2' : ['b', 'b', 'b', 'a', 'b'], 'val' : np.random.randint(10, size=5)})
>>> df
  key1 key2  val
0    a    b    9
1    a    b    8
2    b    b    2
3    b    a    2
4    a    b    1
I am trying to get the total sum of the val column where either key1=='a' or key2=='a'. Here is what I have:
>>> total = (df[(df['key1']=='a') | (df['key2']=='a')]).sum()
>>> total
key1    aaba
key2    bbab
val       20
dtype: object
I have two questions:
# Select only the 'val' column for the matching rows, then sum it
df.loc[(df['key1']=='a') | (df['key2']=='a'), 'val'].sum()
# out
# 20
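# Alternative: test several key columns at once
cols = ['key1','key2']
# axis must be passed by keyword; any(1) raises a TypeError in pandas 2.0+
df.loc[df[cols].eq('a').any(axis=1), 'val'].sum()
# same out
# 20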