How to drop columns after any row has a NaN value in a Python pandas DataFrame

Question

Toy example code

Let's say I have the following DataFrame:

import pandas as pd
import numpy as np
df = pd.DataFrame({"A":[11,21,31], "B":[12,22,32], "C":[np.nan,23,33], "D":[np.nan,24,34], "E":[15,25,35]})

Which would return:

>>> df
    A   B     C     D   E
0  11  12   NaN   NaN  15
1  21  22  23.0  24.0  25
2  31  32  33.0  34.0  35

Remove all columns with `nan` values

I know how to remove every column that has a NaN value in any row, like this:

out1 = df.dropna(axis=1, how="any")

Which returns:

>>> out1
    A   B   E
0  11  12  15
1  21  22  25
2  31  32  35
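
For reference, the same selection can be written as an explicit boolean mask over the columns, which makes it easier to see what `dropna` is doing here. This is only a sketch on the toy DataFrame; the names `complete_cols` and `out1_mask` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [11, 21, 31], "B": [12, 22, 32],
                   "C": [np.nan, 23, 33], "D": [np.nan, 24, 34],
                   "E": [15, 25, 35]})

# Boolean Series indexed by column name: True where the column has no NaN at all.
complete_cols = df.notna().all(axis=0)

# Keeping only those columns matches dropna(axis=1, how="any") on this frame.
out1_mask = df.loc[:, complete_cols]
out1 = df.dropna(axis=1, how="any")
print(out1_mask.equals(out1))
```

Note that column `E` survives in both versions, because each column is judged on its own and its position relative to the NaN columns plays no role.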

Expected output

However, what I want is to remove every column from the first NaN onward, that is, the first column containing a NaN in any row and all columns to its right. In the toy example the first NaN is in column C, so the expected output would keep only columns A and B.

Question

How can I remove all columns after a NaN is found within any row of a pandas DataFrame?

Answer 1

What I would do:

Check every element for being null or not null.
Take the cumulative sum of every row across the columns, so the count becomes nonzero at the first NaN and stays nonzero to its right.
Check `any` for every column, across the rows.
Negate that result and use it as a column indexer:

df.loc[:, ~df.isna().cumsum(axis=1).any(axis=0)]

On the toy DataFrame this keeps only columns A and B.
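
The one-liner above can be unpacked into its intermediate steps. The following is a sketch on the toy DataFrame from the question; the variable names `is_nan`, `seen_nan` and `drop_col` are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [11, 21, 31], "B": [12, 22, 32],
                   "C": [np.nan, 23, 33], "D": [np.nan, 24, 34],
                   "E": [15, 25, 35]})

# Step 1: True where a cell is NaN.
is_nan = df.isna()

# Step 2: per row, count the NaNs seen so far going left to right.
# Row 0 becomes 0, 0, 1, 2, 2; rows 1 and 2 stay all zeros.
seen_nan = is_nan.cumsum(axis=1)

# Step 3: a column is dropped if any row has already seen a NaN by then.
drop_col = seen_nan.any(axis=0)

# Step 4: negate the mask and use it to select columns.
result = df.loc[:, ~drop_col]
print(result)
```

The cumulative sum is what carries the "NaN already seen" state to the right, which is why column `E` is dropped even though it contains no NaN itself.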

Answer 2

I found the following way to get the expected output:

colFirstNaN = df.isna().any(axis=0).idxmax() # Find column that has first NaN element in any row
indexColLastValue = df.columns.tolist().index(colFirstNaN) -1
ColLastValue = df.columns[indexColLastValue]
out2 = df.loc[:, :ColLastValue]

On the toy DataFrame the output then contains only columns A and B.
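
One caveat with the label-based approach: if the very first column contains a NaN, `indexColLastValue` becomes -1, which selects the last column label, so the slice keeps every column instead of none. A positional variant avoids that case. The helper below is a hypothetical sketch, not part of the original answer, and `keep_columns_before_first_nan` is a name invented here.

```python
import numpy as np
import pandas as pd


def keep_columns_before_first_nan(frame):
    """Hypothetical helper: keep the columns left of the first column with a NaN."""
    # One boolean per column, in column order: True if the column has any NaN.
    has_nan = frame.isna().any(axis=0).to_numpy()
    if not has_nan.any():
        # No NaN anywhere, so there is nothing to cut off.
        return frame.copy()
    # argmax on a boolean array gives the position of the first True.
    first_nan_pos = int(has_nan.argmax())
    # Positional slice: everything strictly before that column.
    return frame.iloc[:, :first_nan_pos]


df = pd.DataFrame({"A": [11, 21, 31], "B": [12, 22, 32],
                   "C": [np.nan, 23, 33], "D": [np.nan, 24, 34],
                   "E": [15, 25, 35]})
out2 = keep_columns_before_first_nan(df)
print(out2)
```

Slicing by position with `iloc` also sidesteps the lookup of the previous column label, so a NaN in the first column simply yields a frame with no columns.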

2 answers

Answer 1
4, accepted, 2020-10-09 16:52:13

Answer 2
0, 2020-10-09 16:43:36
