Multidimensional boolean indexing of Pandas DataFrame - remove NaN rows and columns

I have a Pandas DataFrame like this:

import pandas as pd

df = pd.DataFrame([[1,-2,-3],[4,5,6],[1,3,4]])

which looks like:

   0  1  2
0  1 -2 -3
1  4  5  6
2  1  3  4

I would like to get a subset of this DataFrame containing only the negative values:

    1    2
0  -2   -3

I would like to use boolean indexing, but I don't see how to do 2-dimensional boolean indexing. I started with:

In [7]: df_flag = df < 0
In [8]: df_flag
Out[8]:
       0      1      2
0  False   True   True
1  False  False  False
2  False  False  False

So I did:

In [15]: df[df_flag]
Out[15]:
    0   1   2
0 NaN  -2  -3
1 NaN NaN NaN
2 NaN NaN NaN

Is there a way to automatically remove the rows and columns that are full of NaN when using 2-dimensional boolean indexing?

You can make two calls to dropna. dropna accepts a thresh parameter: a row or column is kept only if it has at least that many non-NA values. The following drops the rows first (keeping any row with at least one non-NA value), then drops the columns that still contain NaN:

In [283]:

df[df<0].dropna(axis=0, thresh=1).dropna(axis=1)
Out[283]:
   1  2
0 -2 -3

The result of the first dropna:

In [284]:

df[df<0].dropna(axis=0, thresh=1)
Out[284]:
    0  1  2
0 NaN -2 -3
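
To make the role of thresh a bit more concrete, here is a minimal sketch, assuming a reasonably recent pandas. The names masked, by_thresh and by_how are only illustrative. For this purpose thresh=1 should behave like how="all": a row is dropped only when every value in it is NA.

```python
import pandas as pd

df = pd.DataFrame([[1, -2, -3], [4, 5, 6], [1, 3, 4]])
masked = df[df < 0]  # entries that are not negative become NaN

# thresh=1: keep rows that have at least one non-NA value
by_thresh = masked.dropna(axis=0, thresh=1)

# how="all": drop a row only when all of its values are NA
by_how = masked.dropna(axis=0, how="all")

# Both approaches keep just row 0 here
print(by_thresh.equals(by_how))
```

how="all" may read more clearly when the intent is "drop only the completely empty rows", while thresh is more general if a higher minimum count is needed.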

UPDATE

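In older versions of pandas the axis param accepted a list, so this could be done in a single call (thanks @scls). Passing a list or tuple to axis was deprecated in pandas 0.23 and removed in pandas 1.0, so on current versions use two separate calls as shown above. The single-call form on an old version: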

In [285]:

df[df<0].dropna(axis=[0,1], thresh=1)
Out[285]:
   1  2
0 -2 -3
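
On recent pandas versions, where dropna takes a single axis per call, the same result can be sketched as below. This is only one possible approach: how="all" is swapped in for thresh=1, and the names mask, result, rows, cols and result_loc are illustrative rather than part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([[1, -2, -3], [4, 5, 6], [1, 3, 4]])
mask = df < 0

# Chain two single-axis calls; how="all" drops only fully-NaN rows/columns
result = df[mask].dropna(axis=0, how="all").dropna(axis=1, how="all")
print(result)

# Alternative without dropna: keep rows/columns that contain at least one
# True in the mask, then blank out any remaining non-matching cells
rows = mask.any(axis=1)
cols = mask.any(axis=0)
result_loc = df.loc[rows, cols].where(mask.loc[rows, cols])
print(result_loc)
```

Unlike a plain dropna(axis=1), the how="all" variant keeps a column that still has some NaN in it, so the two approaches can differ when the negative values do not form a full rectangular block.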
