Pandas Sum & Count Across Only Certain Columns
I have just started learning pandas, and this is a very basic question. Believe me, I have searched for an answer, but can't find one.
Can you please run this Python code?
import pandas as pd
df = pd.DataFrame({'A':[1,0], 'B':[2,4], 'C':[4,4], 'D':[1,4],'count__4s_abc':[1,2],'sum__abc':[7,8]})
df
How do I create column 'count__4s_abc', in which I want to count how many times the number 4 appears in just columns A-C? (While ignoring column D.)
How do I create column 'sum__abc', in which I want to sum the amounts in just columns A-C? (While ignoring column D.)
Thanks much for any help!
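Using drop (this assumes df holds only the input columns A-D; if the two result columns from the question's sample frame are already present, they get summed as well):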
df.assign(
    # positional axis in drop('D', 1) was removed in pandas 2.0; use columns=
    count__4s_abc=df.drop(columns='D').eq(4).sum(axis=1),
    sum__abc=df.drop(columns='D').sum(axis=1)
)
Or explicitly choosing the 3 columns.
df.assign(
count__4s_abc=df[['A', 'B', 'C']].eq(4).sum(1),
sum__abc=df[['A', 'B', 'C']].sum(1)
)
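Since the question describes the columns as a range (A through C), a label slice with .loc is another way to pick them. This is only a sketch, not part of the original answer: it assumes the frame holds just the input columns A-D and that A, B and C sit next to each other in column order. The name abc is made up for illustration.

```python
import pandas as pd

# Input columns only; the two result columns are created below.
df = pd.DataFrame({'A': [1, 0], 'B': [2, 4], 'C': [4, 4], 'D': [1, 4]})

# Label slices with .loc include both endpoints, so this selects A, B and C.
abc = df.loc[:, 'A':'C']

result = df.assign(
    count__4s_abc=abc.eq(4).sum(axis=1),  # how many 4s per row
    sum__abc=abc.sum(axis=1),             # row-wise total
)
print(result)
```

Unlike iloc, this keeps working if extra columns are later added in front of A, as long as A through C stay contiguous.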
Or using iloc to get the first 3 columns.
df.assign(
count__4s_abc=df.iloc[:, :3].eq(4).sum(1),
sum__abc=df.iloc[:, :3].sum(1)
)
All give
   A  B  C  D  count__4s_abc  sum__abc
0  1  2  4  1              1         7
1  0  4  4  4              2         8
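To check that the three selections really agree, here is a small self-contained sketch. It assumes a frame built with only columns A-D (so no pre-existing result columns can leak into the totals); the names selections, results and all_equal are made up for this example.

```python
import pandas as pd

# Only the input columns; nothing precomputed.
df = pd.DataFrame({'A': [1, 0], 'B': [2, 4], 'C': [4, 4], 'D': [1, 4]})

# Three ways of selecting columns A, B and C.
selections = {
    'drop': df.drop(columns='D'),
    'explicit': df[['A', 'B', 'C']],
    'iloc': df.iloc[:, :3],
}

# Build the two new columns from each selection.
results = {
    name: df.assign(
        count__4s_abc=sub.eq(4).sum(axis=1),
        sum__abc=sub.sum(axis=1),
    )
    for name, sub in selections.items()
}

# Compare every result against the first one.
all_equal = all(r.equals(results['drop']) for r in results.values())
print(all_equal)
```

The drop variant is the one to watch: it sums every column that is not dropped, so it only matches the others when D is the sole column to exclude.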
One additional option:
In [158]: formulas = """
...: new_count__4s_abc = (A==4)*1 + (B==4)*1 + (C==4)*1
...: new_sum__abc = A + B + C
...: """
In [159]: df.eval(formulas)
Out[159]:
   A  B  C  D  count__4s_abc  sum__abc  new_count__4s_abc  new_sum__abc
0  1  2  4  1              1         7                  1             7
1  0  4  4  4              2         8                  2             8
The DataFrame.eval() method can be (but is not always) faster than regular pandas arithmetic.
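Whether eval() helps depends on the data size and the environment, so it is worth measuring. Below is a rough timing sketch, not a benchmark: the frame big, its size (200,000 rows of random integers) and the helper names with_eval and with_plain are all assumptions made up for illustration, and the numbers will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame: 200,000 rows of random integers in [0, 5).
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 5, size=(200_000, 3)), columns=list('ABC'))

def with_eval():
    # Row-wise sum expressed as an eval() string.
    return big.eval('A + B + C')

def with_plain():
    # The same sum with ordinary pandas arithmetic.
    return big['A'] + big['B'] + big['C']

# Both approaches should produce identical values.
same = bool((with_eval().to_numpy() == with_plain().to_numpy()).all())

# Time five runs of each; results vary by machine and installed libraries.
t_eval = timeit.timeit(with_eval, number=5)
t_plain = timeit.timeit(with_plain, number=5)
print(f'eval: {t_eval:.4f}s  plain: {t_plain:.4f}s  same result: {same}')
```

On a tiny frame like the two-row example above, the parsing overhead of eval() usually outweighs any gain, so plain arithmetic is a reasonable default there.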