I have a pandas.DataFrame object like this:
In [11]: df
Out[11]:
   a  b
0  0  1
1  0  1
2  0  0
3  0  0
4  1  1
[5 rows x 2 columns]
I want to delete the rows that contain only zeros; here, those are the rows with index 2 and 3.
Desired output:
In [12]: magic_func(df)
Out[12]:
   a  b
0  0  1
1  0  1
4  1  1
[3 rows x 2 columns]
df.loc[~(df == 0).all(axis=1)]
Demo:
In [92]: df = pd.DataFrame({'a':[0,0,0,0,1], 'b':[1,1,0,0,1]})
In [93]: df
Out[93]:
   a  b
0  0  1
1  0  1
2  0  0
3  0  0
4  1  1
[5 rows x 2 columns]
In [94]: (df == 0).all(axis=1)
Out[94]:
0    False
1    False
2     True
3     True
4    False
dtype: bool
In [95]: df.loc[~(df == 0).all(axis=1)]
Out[95]:
   a  b
0  0  1
1  0  1
4  1  1
[3 rows x 2 columns]
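For reuse, the same filter can be wrapped in a small function. This is only a minimal sketch: the name drop_all_zero_rows is made up for illustration (it stands in for the magic_func from the question) and is not part of pandas. It also shows an equivalent way to write the mask, keeping rows where any value is nonzero.

```python
import pandas as pd


def drop_all_zero_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return df without the rows that are all zeros."""
    # True for each row in which every column equals 0.
    all_zero = (df == 0).all(axis=1)
    # Keep the rows where that is not the case.
    return df.loc[~all_zero]


df = pd.DataFrame({'a': [0, 0, 0, 0, 1], 'b': [1, 1, 0, 0, 1]})
result = drop_all_zero_rows(df)

# Equivalent mask: keep rows where at least one value is nonzero.
alternative = df.loc[(df != 0).any(axis=1)]
```

Both forms return a new DataFrame that keeps the original index labels (0, 1 and 4 here); append .reset_index(drop=True) if you want a fresh 0..n-1 index.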
df[~df.isin([0]).all(axis=1)]
also works:
In [108]: df[~df.isin([0]).all(axis=1)]
Out[108]:
   a  b
0  0  1
1  0  1
4  1  1
but it may be slower for large dataframes:
In [106]: df2 = pd.concat([df]*10000)
In [109]: %timeit df2.loc[~(df2 == 0).all(axis=1)]
100 loops, best of 3: 5.19 ms per loop
In [110]: %timeit df2[~df2.isin([0]).all(axis=1)]
10 loops, best of 3: 50.2 ms per loop
isin is useful when you need to test membership against a large set of values, but for a single value it isn't surprising that df == 0, being more direct, is faster.
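To illustrate where isin earns its keep, here is a small sketch that drops rows made up entirely of several "empty" values. The data and the choice of 0 and -1 as placeholder values are assumptions invented for this example.

```python
import pandas as pd

# Hypothetical data: both 0 and -1 are treated as "empty" placeholders.
df3 = pd.DataFrame({'a': [0, -1, 0, 2], 'b': [1, 0, -1, -1]})
placeholders = [0, -1]

# isin marks every cell whose value is one of the placeholders;
# all(axis=1) flags rows where that holds for every column.
only_placeholders = df3.isin(placeholders).all(axis=1)
kept = df3[~only_placeholders]
```

Writing the same test with == would need one comparison per value combined with |, which gets unwieldy as the list of values grows.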