Counting True / False values in unique columns per rows in Pandas
I am new to Pandas.
My DataFrame df:
A  B  C  1      2      3      4      5      6      7      8      9
5  2  4  True   False  False  True   False  True   False  True   False
2  2  1  True   True   False  False  False  True   False  True   False
5  4  7  False  False  True   False  True   True   False  True   True
4  4  1  False  True   False  False  False  True   False  True   True
2  0  8  False  False  True   False  True   True   False  True   True
My goal:
To calculate the sum of columns A, B, and C for each category 1-9.
So that I could answer these kinds of questions:
What is the sum of the column A values where column 1 is True? What is the sum of C where column 5 is True?
In reality, I have about 50 categories (1-50), and I want to know if there is a smart way of calculating these sums without having to write this kind of line 50 times:
df['Sum of A where 1 is True'] = df['A'].where(df['1']).sum()
and so on.
Thank you for your suggestions.
You can use DataFrame.melt to reshape the category columns into long form, keep only the rows where the value is True (DataFrame.pop extracts the mask column and removes it in one step), and then aggregate with sum:
df = (df.melt(['A','B','C'], var_name='Type', value_name='mask')
        .loc[lambda x: x.pop('mask')]
        .groupby('Type')
        .sum())
print(df)
       A   B   C
Type
1      7   4   5
2      6   6   2
3      7   4  15
4      5   2   4
5      7   4  15
6     18  12  21
8     18  12  21
9     11   8  16
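Category 7 is missing from this output because none of its rows are True, so the filter leaves nothing to group. Here is a minimal, self-contained sketch of the same melt approach on the sample data, with a reindex step added to bring such empty categories back as zeros. The names value_cols, cat_cols, long_df and sums are my own, and the sketch assumes the category column labels are strings.

```python
import pandas as pd

value_cols = ['A', 'B', 'C']
cat_cols = list('123456789')  # assumption: category labels are strings

# Sample data copied from the question.
rows = [
    (5, 2, 4, True, False, False, True, False, True, False, True, False),
    (2, 2, 1, True, True, False, False, False, True, False, True, False),
    (5, 4, 7, False, False, True, False, True, True, False, True, True),
    (4, 4, 1, False, True, False, False, False, True, False, True, True),
    (2, 0, 8, False, False, True, False, True, True, False, True, True),
]
df = pd.DataFrame(rows, columns=value_cols + cat_cols)

# Long form: one row per (original row, category) pair.
long_df = df.melt(id_vars=value_cols, var_name='Type', value_name='mask')

# Keep only the True rows, drop the mask, and sum A, B, C per category.
sums = long_df[long_df['mask']].drop(columns='mask').groupby('Type').sum()

# Restore categories that had no True rows (here: '7') with zeros.
sums = sums.reindex(cat_cols, fill_value=0)
print(sums)
```

Unlike the snippet above, this keeps the original df intact by writing the result to a separate variable.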
IIUC, this is just matmul:
# replace your columns accordingly
df[list('123456789')].T @ df[list('ABC')]
Output:
    A   B   C
1   7   4   5
2   6   6   2
3   7   4  15
4   5   2   4
5   7   4  15
6  18  12  21
7   0   0   0
8  18  12  21
9  11   8  16
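To connect this back to the two questions asked, here is a small self-contained sketch that rebuilds the sample frame, forms the same product, and reads off individual cells. The names value_cols, cat_cols, mask and totals are mine. Casting the boolean columns to integers first is my own addition, on the assumption that it keeps the result dtype numeric regardless of how a given pandas version treats a mixed bool/int product.

```python
import pandas as pd

value_cols = ['A', 'B', 'C']
cat_cols = list('123456789')  # assumption: category labels are strings

# Sample data copied from the question.
rows = [
    (5, 2, 4, True, False, False, True, False, True, False, True, False),
    (2, 2, 1, True, True, False, False, False, True, False, True, False),
    (5, 4, 7, False, False, True, False, True, True, False, True, True),
    (4, 4, 1, False, True, False, False, False, True, False, True, True),
    (2, 0, 8, False, False, True, False, True, True, False, True, True),
]
df = pd.DataFrame(rows, columns=value_cols + cat_cols)

# True/False become 1/0, so each matrix-product cell is the sum of a
# value column over the rows where that category is True.
mask = df[cat_cols].astype(int)
totals = mask.T @ df[value_cols]

a_where_1 = totals.loc['1', 'A']  # sum of A where column 1 is True
c_where_5 = totals.loc['5', 'C']  # sum of C where column 5 is True
print(a_where_1, c_where_5)  # prints: 7 15
```

Because the product sums over every row, a category with no True values (7 here) stays in the result as a row of zeros. With 50 categories only cat_cols needs to change.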