Multidimensional boolean indexing of Pandas DataFrame - remove NaN rows and columns

Question

I have a Pandas DataFrame like

df = pd.DataFrame([[1,-2,-3],[4,5,6],[1,3,4]])

which looks like

   0  1  2
0  1 -2 -3
1  4  5  6
2  1  3  4

I would like to get the subset of this DataFrame that contains only the negative values:

    1    2
0  -2   -3

I would like to use boolean indexing, but I don't see how to apply it in two dimensions:

In [7]: df_flag = df < 0
In [8]: df_flag
Out[8]:
       0      1      2
0  False   True   True
1  False  False  False
2  False  False  False

So I did

In [15]: df[df_flag]
Out[15]:
    0   1   2
0 NaN  -2  -3
1 NaN NaN NaN
2 NaN NaN NaN

Is there a way to automatically remove the rows and columns that are entirely NaN when using two-dimensional boolean indexing?

Answer 1

You can make two calls to dropna. dropna accepts a thresh parameter: a row (or column) is kept only if it has at least thresh non-NaN values. The following therefore drops the all-NaN rows first, then the columns that still contain a NaN:

In [283]:

df[df<0].dropna(axis=0, thresh=1).dropna(axis=1)
Out[283]:
   1  2
0 -2 -3

The result of the first dropna:

In [284]:

df[df<0].dropna(axis=0, thresh=1)
Out[284]:
    0  1  2
0 NaN -2 -3

UPDATE

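In older pandas versions the axis parameter accepted a list, so this could be done in a single call (thanks @scls). Passing a list was deprecated in pandas 0.23 and removed in 1.0, so on current versions use two chained dropna calls instead: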

In [285]:

df[df<0].dropna(axis=[0,1], thresh=1)
Out[285]:
   1  2
0 -2 -3
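
An alternative that does not rely on dropna at all, and is not part of the original answer, is to reduce the boolean mask itself: keep only the rows and columns that contain at least one True, then mask the remaining cells. The sketch below assumes the same df as in the question; the names mask, rows, and cols are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, -2, -3], [4, 5, 6], [1, 3, 4]])
mask = df < 0

# Rows and columns that contain at least one negative value.
rows = mask.any(axis=1)
cols = mask.any(axis=0)

# Select that sub-block, then set any non-negative cell inside it to NaN.
result = df.loc[rows, cols].where(mask.loc[rows, cols])

print(result)
```

This avoids building the full NaN-padded frame first, and it never drops a row or column that has at least one matching value.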

Answer 1: accepted, 2015-04-30 09:26:36
