How to remove columns after any row has a NaN value in a Python pandas DataFrame

Toy example code

Let's say I have the following DataFrame:

import pandas as pd
import numpy as np
df = pd.DataFrame({"A":[11,21,31], "B":[12,22,32], "C":[np.nan,23,33], "D":[np.nan,24,34], "E":[15,25,35]})

Which would return:

>>> df
    A   B     C     D   E
0  11  12   NaN   NaN  15
1  21  22  23.0  24.0  25
2  31  32  33.0  34.0  35

Remove all columns with NaN values

I know how to remove every column that has a NaN value in any row, like this:

out1 = df.dropna(axis=1, how="any")

Which returns:

>>> out1
    A   B   E
0  11  12  15
1  21  22  25
2  31  32  35

Expected output

However, what I want is to remove every column from the first one that contains a NaN onward, including columns to its right that have no NaN themselves (such as E). In the toy example the expected output would be:

    A   B
0  11  12
1  21  22
2  31  32

Question

How can I remove all columns after a NaN is found within any row of a pandas DataFrame?

What I would do:

  1. check whether each element is null
  2. take the cumulative sum of each row, running across the columns
  3. check any for every column, across the rows
  4. negate that result and use it as a column indexer:
df.loc[:, ~df.isna().cumsum(axis=1).any(axis=0)]

This gives me:

    A   B
0  11  12
1  21  22
2  31  32
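
The one-liner above packs all four steps into a single expression. A minimal sketch that spells them out on the toy DataFrame, so each intermediate result can be inspected (the names is_null, seen_nan, drop_col and result are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [11, 21, 31], "B": [12, 22, 32],
                   "C": [np.nan, 23, 33], "D": [np.nan, 24, 34],
                   "E": [15, 25, 35]})

# Step 1: boolean mask, True where a value is missing
is_null = df.isna()

# Step 2: running count of NaNs seen so far, left to right within each row
seen_nan = is_null.cumsum(axis=1)

# Step 3: flag a column if any row has seen a NaN at or before it
drop_col = seen_nan.any(axis=0)

# Step 4: keep only the columns that were not flagged
result = df.loc[:, ~drop_col]

print(seen_nan)
print(drop_col)
print(result)
```

Because the cumulative sum stays non-zero once a NaN has been seen, column E is flagged even though it contains no NaN itself.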

I found the following way to get the expected output:

hasNaN = df.isna().any(axis=0)  # True for each column that contains a NaN in any row
# Position of the first column with a NaN; if no column has one, keep every column
indexColFirstNaN = hasNaN.to_numpy().argmax() if hasNaN.any() else len(df.columns)
out2 = df.iloc[:, :indexColFirstNaN]  # positional slice stops just before that column

The output would then be:

>>> out2
    A   B
0  11  12
1  21  22
2  31  32
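
Two edge cases are worth checking for any approach of this kind: a NaN in the very first column, and a DataFrame with no NaN at all. A minimal sketch, assuming positional slicing with iloc is acceptable; drop_from_first_nan is a hypothetical helper name, not a pandas function:

```python
import numpy as np
import pandas as pd


def drop_from_first_nan(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep only the columns to the left of the first column containing a NaN."""
    has_nan = frame.isna().any(axis=0).to_numpy()
    # argmax returns 0 for an all-False array, so the "no NaN" case is handled explicitly
    stop = int(has_nan.argmax()) if has_nan.any() else frame.shape[1]
    return frame.iloc[:, :stop]


df = pd.DataFrame({"A": [11, 21, 31], "B": [12, 22, 32],
                   "C": [np.nan, 23, 33], "D": [np.nan, 24, 34],
                   "E": [15, 25, 35]})

print(list(drop_from_first_nan(df).columns))                   # ['A', 'B']
print(list(drop_from_first_nan(df[["A", "B", "E"]]).columns))  # ['A', 'B', 'E'] (no NaN)
print(list(drop_from_first_nan(df[["C", "A"]]).columns))       # [] (NaN in first column)
```

Slicing by position rather than by label avoids having to look up "the column before the first NaN column", which does not exist when the first column already contains a NaN.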
