
Multidimensional boolean indexing of Pandas DataFrame - remove NaN rows and columns

I have a Pandas DataFrame like

import pandas as pd
df = pd.DataFrame([[1,-2,-3],[4,5,6],[1,3,4]])

which looks like

   0  1  2
0  1 -2 -3
1  4  5  6
2  1  3  4

I would like to get a subset of this DataFrame containing only the negative values:

    1    2
0  -2   -3

I would like to use boolean indexing, but I don't see how to do two-dimensional boolean indexing. I started with a boolean mask:

In [7]: df_flag = df < 0
In [8]: df_flag
Out[8]:
       0      1      2
0  False   True   True
1  False  False  False
2  False  False  False

So I did

In [15]: df[df_flag]
Out[15]:
    0   1   2
0 NaN  -2  -3
1 NaN NaN NaN
2 NaN NaN NaN

Is there a way to automatically remove the rows and columns that are entirely NaN when using two-dimensional boolean indexing?

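You can make two calls to dropna. dropna accepts a thresh parameter: a row (or column) is kept only if it has at least thresh non-NaN values, so thresh=1 drops only the rows that are entirely NaN. The second call uses the default how='any', which drops every column that still contains a NaN. The following therefore drops rows, then columns: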

In [283]:

df[df<0].dropna(axis=0, thresh=1).dropna(axis=1)
Out[283]:
   1  2
0 -2 -3

The result of the first dropna:

In [284]:

df[df<0].dropna(axis=0, thresh=1)
Out[284]:
    0  1  2
0 NaN -2 -3
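
For reference, here is the same two-step approach as a self-contained script. This is only a sketch: the variable names are mine, and it uses how="all" for the row step, which should behave like thresh=1 here (a row is dropped only when every value in it is NaN).

```python
import pandas as pd

df = pd.DataFrame([[1, -2, -3], [4, 5, 6], [1, 3, 4]])

# Boolean indexing keeps the negative values and puts NaN everywhere else
# (the columns are upcast to float to hold NaN).
negatives = df[df < 0]

# Step 1: drop the rows in which every value is NaN.
rows_kept = negatives.dropna(axis=0, how="all")

# Step 2: drop the columns that still contain any NaN (default how="any").
result = rows_kept.dropna(axis=1)

print(result)
```

Because the second step uses how="any", a column that is negative in one remaining row but not in another would be dropped as well; pass how="all" there too if you want to keep such columns.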

UPDATE

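When this answer was written, the axis parameter accepted a list of axes, so it could be done in a single call (thanks @scls). Note that passing a list or tuple to axis was deprecated in pandas 0.23 and removed in pandas 1.0, so the following only works on older versions: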

In [285]:

df[df<0].dropna(axis=[0,1], thresh=1)
Out[285]:
   1  2
0 -2 -3
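
If your pandas version does not accept a list for axis, one possible single-expression alternative is to select the rows and columns directly from the boolean mask. This is a sketch of a different technique (mask.any with .loc), not the dropna call above, and the names mask and subset are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, -2, -3], [4, 5, 6], [1, 3, 4]])

mask = df < 0

# Keep only the rows and columns that contain at least one negative value.
# df.where(mask) replaces the non-negative entries with NaN, matching df[mask].
subset = df.where(mask).loc[mask.any(axis=1), mask.any(axis=0)]

print(subset)
```

This keeps any row or column with at least one negative value, so the result can still contain NaN when the negative values do not form a full rectangular block. The values come back as floats because NaN forces the upcast.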
