
Filter DataFrame in Pandas on sum of rows

I have a DataFrame:

[1] df
ProductIds  A   B   C   D
11210000018 0   0   0   0
11210000155 1   0   0   0
11210006508 0   0   0   0
11210007253 0   0   0   0
11210009431 0   0   0   0
11210135871 1   0   0   0

I want to filter the frame by summing each row and keeping only the rows whose sum is greater than zero. For the given data, the result would be:

ProductIds  A   B   C   D
11210000155 1   0   0   0
11210135871 1   0   0   0

One way of doing that is to add another column with the sum and then filter, like the following:

df['Sum'] = df.sum(axis=1)
df = df[df.Sum > 0]
df = df.drop(columns=['Sum'])

But is there a built-in one-liner for this? I cannot add the columns manually because there are thousands of them. Thanks.

I think you can use DataFrame.all if the DataFrame contains only 0 and numbers greater than 0: test whether all values in each row are 0, then invert that mask with boolean indexing:

mask = (df == 0).all(axis=1)
print (mask)
ProductIds
11210000018     True
11210000155    False
11210006508     True
11210007253     True
11210009431     True
11210135871    False
dtype: bool

print (df[~mask])
             A  B  C  D
ProductIds             
11210000155  1  0  0  0
11210135871  1  0  0  0
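
For anyone who wants to run this end to end, here is a minimal sketch. It rebuilds the sample frame from the question (assuming ProductIds is the index, as in the output above) and uses `df.ne(0).any(axis=1)`, which is the logical negation of `(df == 0).all(axis=1)`, so no `~` is needed. The variable names `has_nonzero` and `filtered_any` are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question; ProductIds is assumed to be the index.
df = pd.DataFrame(
    {
        "A": [0, 1, 0, 0, 0, 1],
        "B": [0, 0, 0, 0, 0, 0],
        "C": [0, 0, 0, 0, 0, 0],
        "D": [0, 0, 0, 0, 0, 0],
    },
    index=pd.Index(
        [11210000018, 11210000155, 11210006508,
         11210007253, 11210009431, 11210135871],
        name="ProductIds",
    ),
)

# True for every row that holds at least one non-zero value.
has_nonzero = df.ne(0).any(axis=1)

# Boolean indexing keeps only the rows where the mask is True.
filtered_any = df[has_nonzero]
print(filtered_any)
```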

A more general solution is to use the boolean mask directly in boolean indexing; it is not necessary to create a new column:

df = df[df.sum(axis = 1) > 0]
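
Note that the sum test and the "any non-zero value" test only agree when the data has no negative numbers. The sketch below uses hypothetical data (not from the question) in which one row's values cancel out, to show where the two filters differ.

```python
import pandas as pd

# Hypothetical data: row 103 contains +1 and -1, so its sum is 0.
demo = pd.DataFrame(
    {"A": [0, 1, 1], "B": [0, 0, -1]},
    index=pd.Index([101, 102, 103], name="ProductIds"),
)

# Keep rows whose sum is strictly positive (the approach shown above).
by_sum = demo[demo.sum(axis=1) > 0]

# Keep rows that contain at least one non-zero value.
by_nonzero = demo[demo.ne(0).any(axis=1)]

print(by_sum.index.tolist())
print(by_nonzero.index.tolist())
```

If the frame can contain negative values, pick whichever test matches what "empty row" means for your data.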

Another solution:

In [194]: df.query('A + B + C + D > 0')
Out[194]:
             A  B  C  D
ProductIds
11210000155  1  0  0  0
11210135871  1  0  0  0
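
Typing the column names by hand does not scale, so the query string could be built from `df.columns` instead. This is only a sketch: it assumes every column name is a valid Python identifier, the sample data is hypothetical, and with thousands of columns the `sum(axis=1)` approach is probably the simpler choice.

```python
import pandas as pd

# Hypothetical small frame; column names must be valid Python identifiers.
df = pd.DataFrame(
    {"A": [0, 1, 0], "B": [0, 0, 0], "C": [0, 0, 0], "D": [0, 0, 1]},
    index=pd.Index([1, 2, 3], name="ProductIds"),
)

# Build "A + B + C + D > 0" from the column labels.
expr = " + ".join(df.columns) + " > 0"

result = df.query(expr)
print(result)
```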
