Filter DataFrame in Pandas on sum of rows
I have a DataFrame:
[1] df
ProductIds A B C D
11210000018 0 0 0 0
11210000155 1 0 0 0
11210006508 0 0 0 0
11210007253 0 0 0 0
11210009431 0 0 0 0
11210135871 1 0 0 0
I want to filter the frame by summing each row and keeping only the rows whose sum is greater than zero. For the given data the result would be:
ProductIds A B C D
11210000155 1 0 0 0
11210135871 1 0 0 0
One way of doing that is to add another column with the sum and then filter like the following:
df['Sum'] = df.sum(axis = 1)
df = df[df.Sum > 0]
df = df.drop(['Sum'], axis=1)
But is there a built-in one-liner to do this? I cannot add the columns manually because there are thousands of columns. Thanks.
I think you can use DataFrame.all if the DataFrame contains only 0 and numbers greater than 0: test whether all values in a row are 0, and then use boolean indexing:
mask = (df == 0).all(axis=1)
print (mask)
ProductIds
11210000018 True
11210000155 False
11210006508 True
11210007253 True
11210009431 True
11210135871 False
dtype: bool
print (df[~mask])
A B C D
ProductIds
11210000155 1 0 0 0
11210135871 1 0 0 0
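To try this without the original data, here is a minimal self-contained sketch. It assumes ProductIds is the index (as the printed output above suggests) and rebuilds the sample frame by hand; the name `filtered` is only for illustration.

```python
import pandas as pd

# Sample data from the question, with ProductIds assumed to be the index
df = pd.DataFrame(
    {
        "A": [0, 1, 0, 0, 0, 1],
        "B": [0, 0, 0, 0, 0, 0],
        "C": [0, 0, 0, 0, 0, 0],
        "D": [0, 0, 0, 0, 0, 0],
    },
    index=pd.Index(
        [11210000018, 11210000155, 11210006508,
         11210007253, 11210009431, 11210135871],
        name="ProductIds",
    ),
)

# True for rows in which every value equals 0
mask = (df == 0).all(axis=1)

# Keep the rows that are NOT all zero
filtered = df[~mask]
print(filtered)
```

Because the mask is computed across all columns at once, nothing has to be listed by hand, which matters when there are thousands of columns.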
A more general solution is to use a boolean mask in boolean indexing; it is not necessary to create a new column:
df = df[df.sum(axis = 1) > 0]
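Note that the sum test and the all-zero test select the same rows only when no value is negative. A small sketch on made-up data (the frame `demo` and its values are hypothetical, not from the question) shows where they differ: a row such as 1 and -1 sums to 0, so the sum test drops it even though it is not all zero.

```python
import pandas as pd

# Hypothetical data: row "z" has nonzero values that cancel out
demo = pd.DataFrame({"A": [0, 1, 1], "B": [0, 0, -1]}, index=["x", "y", "z"])

# Keep rows whose sum is greater than zero
by_sum = demo[demo.sum(axis=1) > 0]

# Keep rows that contain at least one nonzero value
by_nonzero = demo[(demo != 0).any(axis=1)]

print(by_sum.index.tolist())      # ['y']
print(by_nonzero.index.tolist())  # ['y', 'z']
```

If the data can contain negative numbers, pick whichever condition matches what you actually mean by "filter".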
Another solution, practical only when there are few columns to name:
In [194]: df.query('A + B + C + D > 0')
Out[194]:
A B C D
ProductIds
11210000155 1 0 0 0
11210135871 1 0 0 0