How to remove rows from a DataFrame where some columns only have zero values
I have the following Pandas DataFrame in Python:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.array([[1, 2, 3, 4, 5, 6], [11, 22, 33, 44, 55, 66],
[111, 222, 0, 0, 0, 0], [1111, 0, 0, 0, 0, 0]]),
columns=['a', 'b', 'c', 'd', 'e', 'f'])
As a table, the DataFrame looks like this:
a b c d e f
0 1 2 3 4 5 6
1 11 22 33 44 55 66
2 111 222 0 0 0 0
3 1111 0 0 0 0 0
The original DataFrame is much bigger than this. As seen, some rows have zero values in some columns (c, d, e, f).
I need to remove these rows from the DataFrame so that my new DataFrame looks like the following (after removing the rows where the given columns contain only zeros):
a b c d e f
0 1 2 3 4 5 6
1 11 22 33 44 55 66
I only need to remove the rows where all of these columns (c, d, e, and f) are zero. If, for example, only 2 of them are 0, then I will not remove such rows.
Is there a good way of doing this without looping through the DataFrame?
Try this:
df[~df[list('cdef')].eq(0).all(axis = 1)]
a b c d e f
0 1 2 3 4 5 6
1 11 22 33 44 55 66
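To see why the one-liner above works, it can help to look at the intermediate boolean mask. This is only a minimal sketch that reuses the frame from the question; the names `cols`, `all_zero` and `result` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 3, 4, 5, 6], [11, 22, 33, 44, 55, 66],
                            [111, 222, 0, 0, 0, 0], [1111, 0, 0, 0, 0, 0]]),
                  columns=['a', 'b', 'c', 'd', 'e', 'f'])

cols = ['c', 'd', 'e', 'f']            # columns to check (illustrative name)
all_zero = df[cols].eq(0).all(axis=1)  # True where every checked column is 0
result = df[~all_zero]                 # keep only the rows that are not all-zero

print(all_zero.tolist())
print(result.index.tolist())
```

The mask is True only for rows 2 and 3, so negating it with ~ keeps rows 0 and 1.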
Filter the rows on the selected columns, keeping a row if any of those columns is non-zero, using any:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.array([[1, 2, 3, 4, 5, 6], [11, 22, 33, 44, 55, 66],
[111, 222, 0, 0, 0, 0], [1111, 0, 0, 0, 0, 0]]),
columns=['a', 'b', 'c', 'd', 'e', 'f'])
df = df[(df[['c', 'd', 'e', 'f']] != 0).any(axis=1)]
print(df)
Output:
a b c d e f
0 1 2 3 4 5 6
1 11 22 33 44 55 66
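The question also asks that rows with only some zeros in c to f be kept. A small sketch of that case, assuming a made-up frame `sample` with one partially-zero row and a hypothetical helper `drop_all_zero_rows` (neither appears in the original answers):

```python
import pandas as pd


def drop_all_zero_rows(frame, cols):
    """Drop rows in which every column listed in `cols` equals 0."""
    # A row survives as soon as one of the checked columns is non-zero.
    return frame[frame[cols].ne(0).any(axis=1)]


# Made-up data: row 1 is all-zero in c..f, row 2 has only two zeros there.
sample = pd.DataFrame([[1, 2, 3, 4, 5, 6],
                       [111, 222, 0, 0, 0, 0],
                       [5, 6, 0, 0, 7, 8]],
                      columns=['a', 'b', 'c', 'd', 'e', 'f'])

kept = drop_all_zero_rows(sample, ['c', 'd', 'e', 'f'])
print(kept.index.tolist())
```

Only row 1 is dropped; row 2 stays because e and f are non-zero. "Any value is non-zero" is simply the negation of "all values are zero", so this selects the same rows as the ~...eq(0).all(axis=1) form.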
Here is one more option: use df.query() with a self-defined query.
my_query = '~('+' and '.join([f'{name}==0' for name in 'cdef'])+')'
df.query(my_query)
If you print my_query, it is easy to read: ~(c==0 and d==0 and e==0 and f==0), where ~ means 'not'.
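For completeness, here is a self-contained sketch of the query approach run against the frame from the question. It assumes the column names are valid Python identifiers, which df.query() needs unless names are backtick-quoted; the names `cols` and `filtered` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5, 6], [11, 22, 33, 44, 55, 66],
                   [111, 222, 0, 0, 0, 0], [1111, 0, 0, 0, 0, 0]],
                  columns=['a', 'b', 'c', 'd', 'e', 'f'])

cols = ['c', 'd', 'e', 'f']  # columns to check (illustrative name)

# Join the per-column tests with ' and ' (spaces on both sides),
# then negate the whole condition with ~.
my_query = '~(' + ' and '.join(f'{name} == 0' for name in cols) + ')'
print(my_query)

filtered = df.query(my_query)
print(filtered.index.tolist())
```

Building the string from a list makes it easy to change which columns are checked without retyping the condition.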
With operators:
df.loc[~((df['c'] == 0) & (df['d'] == 0) & (df['e'] == 0) & (df['f'] == 0))]
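If the list of columns is long, the chained & comparisons can be built programmatically instead of typed out. This is just one possible sketch using functools.reduce from the standard library; `cols`, `all_zero` and `result` are illustrative names, not from the original answer.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5, 6], [11, 22, 33, 44, 55, 66],
                   [111, 222, 0, 0, 0, 0], [1111, 0, 0, 0, 0, 0]],
                  columns=['a', 'b', 'c', 'd', 'e', 'f'])

cols = ['c', 'd', 'e', 'f']  # columns to check (illustrative name)

# Combine the per-column boolean Series with &, the same as writing
# (df['c'] == 0) & (df['d'] == 0) & ... by hand.
all_zero = reduce(operator.and_, (df[name] == 0 for name in cols))
result = df.loc[~all_zero]

print(result.index.tolist())
```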